

















Vol. XXXIV. Parts I and II January, 1947 


BIOMETRIKA 


A JOURNAL FOR THE STATISTICAL STUDY OF 
BIOLOGICAL PROBLEMS 


FOUNDED BY 
W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON


EDITED BY 
EGON S. PEARSON 


IN CONSULTATION WITH 


HARALD CRAMÉR J. B. S. HALDANE
R. C. GEARY G. M. MORANT 
MAJOR GREENWOOD JOHN WISHART 


ISSUED BY THE BIOMETRIKA OFFICE 
UNIVERSITY COLLEGE, LONDON 


AND PRINTED AT THE 
UNIVERSITY PRESS, CAMBRIDGE 


Reprinted by offset-litho 1964 








[Issued 11 February 1947] 



















THE VARIANCE OF THE OVERLAP OF GEOMETRICAL FIGURES 
WITH REFERENCE TO A BOMBING PROBLEM 


By F. GARWOOD, Ph.D.


1. INTRODUCTION 
The present paper deals with a particular problem arising in the mathematical study of 
bombing. Briefly, the general problem is that of predicting the over-all effects of a bombing
attack carried out under given conditions against a given target, and the mathematical 
treatment involves various simplifying assumptions concerning these conditions. 

In the type of problem considered here, attention is centred on the total plan area of 
damage caused to a single building by bombs falling independently and at random over a 
larger area containing the building. It is assumed that each bomb damages all that part of 
the building contained within a circle of fixed size centred at the bomb (a square damage 
area is also considered), while the building has a simple plan outline, such as a rectangle or 
a circle. The area of damage of two or more adjacent bombs is merely the area covered by 
the circles. The theoretical problems dealt with are those of estimating the variance of the 
amount of damaged area (the estimation of the mean or expected damage presents little 
difficulty). It would be more satisfactory to obtain the complete frequency distributions, 
but this has so far not been achieved, nor has it been possible to obtain explicit formulae for 
the 3rd and 4th moments. 

As there may be applications of the problems to fields completely different from those of 
bombing studies, and as they are problems which involve essentially the concepts of geometry 
and of probability, it is convenient to express them entirely in these terms. 

We thus have problems of the following type. A number of circles are placed at random on 
a plane so that each one has some or all of its area inside a fixed square. What are the mean 
and variance of the area of the square covered by the circles? The fundamentals of this type 
of problem have been studied by Robbins (1944), and Bronowski & Neyman have 
dealt with another particular case.* Robbins’s results enable us to deal with geometrical 
figures other than circles and squares, and also to deal with cases where the number, 
position and orientation of the ‘covering’ figures follow probability laws other than the 
simple ones implied in the above example. 


2. ROBBINS'S THEOREM
In leading up to his theorem, Robbins uses the concept of a random measurable subset $X$
of n-dimensional Euclidean space $E_n$. He defines the function $g(x, X)$ for every point $x$
of $E_n$, and for every $X$, as equal to 1 for $x \in X$ and zero elsewhere. This theorem is then as
follows:

Let $X$ be a random Lebesgue measurable subset of $E_n$, with measure $\mu(X)$. For any point
$x$ of $E_n$ let $p(x) = \Pr(x \in X)$. Then, assuming that the function $g(x, X)$ is a measurable function
of the pair $(x, X)$, the expected value of the measure of $X$ will be given by the Lebesgue integral of
the function $p(x)$ over $E_n$.


* Note by Editor. This paper was received for publication in September 1945; Dr Garwood has 
asked me to add the following note in proof. ‘‘The author had the privilege of seeing the work of 
Bronowski & Neyman in proof. This paper was then submitted, after which their work was published 
(1945) together with a second article by Robbins (1945), who has solved, among others, some of the 
problems dealt with in this paper, as acknowledged in later footnotes.” 




Robbins generalizes this result to obtain the mth moment of the measure of $X$; this is the
integral of the function $p(x_1, x_2, \ldots, x_m)$ over $E_{mn}$, where
$$p(x_1, x_2, \ldots, x_m) = \Pr(x_1 \in X \text{ and } x_2 \in X \ldots \text{ and } x_m \in X). \tag{1}$$
It is useful to give a simple non-rigorous proof of this result. Suppose the space $E_n$ to be
divided into an enumerably infinite set of small elements $\omega_1, \omega_2, \ldots$. If we assume that any
particular subset $X$ can be made up of a selection of the $\omega$'s, then
$$\mu(X) = \lambda_1\omega_1 + \lambda_2\omega_2 + \ldots,$$
where $\omega_p$ is here used also as the measure of the element $\omega_p$, and where the $\lambda$'s, appropriate
to this particular $X$, are 0 or 1. Hence
$$\{\mu(X)\}^m = \sum_p \sum_q \ldots \lambda_p \lambda_q \ldots \omega_p \omega_q \ldots,$$
where each summation is over the whole of $E_n$, and there are $m$ such summations. Thus
$$\operatorname{exp}\{\mu(X)\}^m = \sum_p \sum_q \ldots \omega_p \omega_q \ldots \operatorname{exp}(\lambda_p \lambda_q \ldots),$$
exp denoting expectation. But the expectation of $\lambda_p \lambda_q \ldots$ is the probability that the elements $\omega_p, \omega_q, \ldots$ are in $X$, and
on proceeding to the limit the desired result is obtained.

The verification of this result in the case, say, of the 2nd moment of a linearly distributed
variate, is instructive. Thus suppose $x$ is a variate with a probability function $F(x)$, i.e. the
probability of obtaining a value $< x$ is given by the measurable function $F(x)$, where
$$F(-\infty) = 0 \quad\text{and}\quad F(\infty) = 1.$$
Define $X$ as the interval from 0 to $x$; then the expectation of the square of the measure of
$X$ is the 2nd moment of $x$. To use Robbins's theorem we need
the probability $p(x_1, x_2)$ that a given pair of values $x_1$ and $x_2$
both lie in the interval 0, $x$ chosen at random. Using co-ordinate
axes $Ox_1$, $Ox_2$, this probability is zero in the 2nd and
4th quadrants, since 0, $x$ cannot contain two points $x_1$ and $x_2$
of opposite signs. In the region A (see Fig. 1), where $x_1 > x_2 > 0$,
the two values $x_1$ and $x_2$ are both in 0, $x$ if $x > x_1$, and the
probability of this is $1 - F(x_1)$. Thus in A,
$$p(x_1, x_2) = 1 - F(x_1).$$
Similarly in B
$$p(x_1, x_2) = 1 - F(x_2),$$
while in C and D, the two halves of the 3rd quadrant, $p(x_1, x_2) = F(x_1)$ where $x_1 < x_2$ and $p(x_1, x_2) = F(x_2)$ where $x_2 < x_1$.

[Fig. 1. The plane of $x_1$, $x_2$, divided into the regions A, B, C and D.]

The integral of $p(x_1, x_2)$ over A is seen to be
$$\int_0^\infty x_1\{1 - F(x_1)\}\,dx_1,$$
while the total integral of $p(x_1, x_2)$ over the whole plane is
$$2\int_0^\infty x\{1 - F(x)\}\,dx - 2\int_{-\infty}^0 x F(x)\,dx.$$
A single integration by parts then leads to
$$\int_{-\infty}^{\infty} x^2\,dF(x),$$
which is the 2nd moment of $x$, as required.
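The verification above can also be checked numerically. The following is only a sketch: the standard normal form of $F(x)$, the truncation of the range at ±10 and the midpoint rule are all illustrative assumptions, none of them taken from the paper. It evaluates the expression $2\int_0^\infty x\{1-F(x)\}dx - 2\int_{-\infty}^0 xF(x)dx$ and should return a value close to unity, the 2nd moment of a standard normal variate.

```python
import math


def normal_cdf(x):
    """F(x) for a standard normal variate (an illustrative choice of F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def second_moment_from_cdf(cdf, lower=-10.0, upper=10.0, steps=20000):
    """Midpoint-rule value of 2*int_0^upper x{1-F(x)}dx - 2*int_lower^0 x F(x) dx."""
    h_pos = upper / steps
    h_neg = -lower / steps
    total = 0.0
    for i in range(steps):
        x_pos = (i + 0.5) * h_pos            # midpoint in (0, upper)
        x_neg = lower + (i + 0.5) * h_neg    # midpoint in (lower, 0)
        total += 2.0 * x_pos * (1.0 - cdf(x_pos)) * h_pos
        total -= 2.0 * x_neg * cdf(x_neg) * h_neg
    return total


if __name__ == "__main__":
    # The 2nd moment about the origin of a standard normal variate is 1.
    print(second_moment_from_cdf(normal_cdf))
```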













3. APPLICATION OF ROBBINS'S THEOREM TO OVERLAP PROBLEMS


We shall be concerned with cases in $E_2$ where the subset $X$ is the part of a fixed area $A$ in the
plane which is covered by a number of areas $C$ dropped independently and at random on the
plane. We suppose $A$ to be the interior of a simple closed curve, while each $C$ is the interior
of another curve. The area $C$ has a reference point $Q$ (conveniently called its centre) and a
reference line, and it is assumed that there is a frequency distribution $\phi(x, y, \theta)$ of the
position $(x, y)$ of $Q$ and of the inclination $\theta$ of the reference line to a fixed direction in $E_2$.
$\phi(x, y, \theta)$ can be assumed to be zero outside an area $T$, i.e. the points $Q$ are distributed inside
$T$. (In the applications the angle $\theta$ will be constant and the areas $C$ will be equally likely to
fall anywhere over $T$, so that we can write $\phi(x, y, \theta) = 1/T$.) Another chance variable is $k$,
the number of areas $C$; its distribution can be defined by the series $p_0, p_1, p_2, \ldots, p_k, \ldots$
(which are the probabilities of $0, 1, 2, \ldots, k, \ldots$ $C$'s falling on $T$), or by the probability
generating function $G(u)$, where
$$G(u) = p_0 + p_1 u + p_2 u^2 + \ldots + p_k u^k + \ldots. \tag{2}$$
Finally, it is more convenient to consider the moments of the area $Y = A - X$, i.e. the area
of $A$ (we can use the symbols $Y$, etc. for either the sets or their areas) not covered by the $C$'s.
Evidently the variances of $X$ and $Y$ are equal.

To obtain the 1st moment of $Y$, we need first the probability $p(x_1, y_1)$ that a point $(x_1, y_1)$
of $A$ will belong to $Y$, i.e. of $(x_1, y_1)$ not being covered by a $C$. Now $(x_1, y_1)$ will not be covered
by a particular $C$ falling at an inclination $\theta$ if the centre $Q(x, y)$ falls outside an area
$C(x_1, y_1, \theta)$ obtained by centring the $C$ at $(x_1, y_1)$ and rotating it through 180°. If the part
of $T$ exterior to this area is called $T - C(x_1, y_1, \theta)$, and if we allow all inclinations, the probability
of this occurring is
$$q(x_1, y_1) = \iiint_{T - C(x_1, y_1, \theta)} \phi(x, y, \theta)\,dx\,dy\,d\theta. \tag{3}$$
If $k$ $C$'s are dropped independently, the probability is given by $q^k(x_1, y_1)$, so that the total
probability of $(x_1, y_1)$ belonging to $Y$ is
$$p(x_1, y_1) = \sum_k p_k q^k(x_1, y_1) = G\{q(x_1, y_1)\}. \tag{4}$$
The 1st moment of $Y$ is thus, in the case of $k$ $C$'s,
$$\mu_1(Y) = \iint_A q^k(x_1, y_1)\,dx_1\,dy_1, \tag{5}$$
and in the general case
$$\mu_1(Y) = \iint_A G\{q(x_1, y_1)\}\,dx_1\,dy_1. \tag{6}$$


The 2nd moment is obtained by a similar process; we require the probability $p(x_1, y_1, x_2, y_2)$
that neither of two points $(x_1, y_1)$ and $(x_2, y_2)$ is covered by a $C$. Corresponding to these two
points and an inclination $\theta$ the permissible region in which each centre $Q$ can fall is
$$T - C(x_1, y_1, \theta) - C(x_2, y_2, \theta) = T - C_1 - C_2.$$
Thus for one $C$ the probability is
$$q(x_1, y_1, x_2, y_2) = \iiint_{T - C_1 - C_2} \phi(x, y, \theta)\,dx\,dy\,d\theta, \tag{7}$$
giving in the case of $k$ $C$'s,
$$p(x_1, y_1, x_2, y_2) = q^k(x_1, y_1, x_2, y_2) \quad\text{and}\quad p(x_1, y_1, x_2, y_2) = G\{q(x_1, y_1, x_2, y_2)\}$$
in the general case. Thus the 2nd moment
$$\mu_2(Y) = \iint_A \iint_A q^k(x_1, y_1, x_2, y_2)\,dx_1\,dy_1\,dx_2\,dy_2 \quad\text{for } k\ C\text{'s} \tag{8}$$
and
$$\mu_2(Y) = \iint_A \iint_A G\{q(x_1, y_1, x_2, y_2)\}\,dx_1\,dy_1\,dx_2\,dy_2 \tag{9}$$
in the general case. In general, for the mth moment, the probability that $(x_1, y_1) \ldots (x_m, y_m)$
are not covered by $k$ $C$'s is
$$q^k(x_1, y_1, x_2, y_2, \ldots, x_m, y_m),$$
where
$$q(x_1, y_1, \ldots, x_m, y_m) = \int_0^{2\pi}\!\!\iint_{T - C_1 - C_2 - \ldots - C_m} \phi(x, y, \theta)\,dx\,dy\,d\theta, \tag{10}$$
and $T - C_1 - C_2 \ldots - C_m$ is the area of $T$ outside $C$'s centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$
and rotated through 180°.

Thus the mth moment is equal to
$$\mu_m(Y) = \int_A \ldots \int_A q^k(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m$$
in the case of $k$ $C$'s, or
$$\mu_m(Y) = \int_A \ldots \int_A G\{q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\}\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m \tag{11}$$
in the general case.





4. UNIFORM DISTRIBUTION OF COVERING AREAS AT CONSTANT INCLINATION 


As mentioned above, in the cases with which we shall be dealing, the areas $C$ are equally
likely to fall anywhere over $T$, and the angle $\theta$ is constant. The function $\phi(x, y, \theta)$ can be put
equal to $1/T$ for points of $T$ and zero outside; the variable $\theta$, and integration with respect
to it, may be omitted.

The function $q(x_1, y_1)$ is the fraction of $T$ not covered by a $C$ centred at $(x_1, y_1)$ and rotated
through 180°, and in general $q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)$ is the fraction of $T$ outside $m$ $C$'s
centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$ and rotated through 180°.

Instead of the variate $Y$ we can consider $Y/A$, i.e. the fraction of $A$ not covered by $k$ $C$'s,
and to obtain its mth moment we divide $\mu_m(Y)$ by $A^m$. Also the quantity $dx_1\,dy_1 \ldots dx_m\,dy_m/A^m$
is the probability of obtaining $m$ centres in the elements of area $dx_1\,dy_1, \ldots, dx_m\,dy_m$ of $A$
if these centres are uniformly distributed over $A$.

We thus obtain the following result from (11): the mth moment of the fraction of A not 
covered by k C’s with their centres falling at random on T is equal to the kth moment of the 
fraction of T not covered by m C’s with their centres falling at random on A and rotated through 
180°. 

In the case k = m = 1 we can express this in a slightly different way if we (i) deal with the
area common to the two areas concerned, (ii) regard all orientations as possible and as
equally likely, and (iii) deal with areas rather than fractions. We obtain, in fact, the following
result: the integral of the overlap of C and A, when the centre of C is taken over T and all
orientations are permitted, is equal to the corresponding integral of the overlap of C and T, for
all positions of the centre of C on A and for all orientations.

In the practical cases with which we shall deal, the area $A$ is always ‘well inside’ $T$, i.e.
every point of $A$ can be reached by a $C$ centred somewhere in $T$. In such cases the formula
for the mean overlap is simple; we have $m = 1$ and the fraction of $T$ not covered by one $C$
is $(T - C)/T$, which is constant for all $(x_1, y_1)$, so that its kth moment is $(T - C)^k/T^k$, i.e.
$$\mu_1(Y/A) = \left(\frac{T - C}{T}\right)^k. \tag{12}$$
If the number of $C$'s follows a probability generating function $G(u)$, the mean is given by
$$\mu_1(Y/A) = G\left(\frac{T - C}{T}\right). \tag{13}$$
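Formula (12) lends itself to a simple sampling check. The sketch below is an illustration only and not the author's procedure: it assumes circles of radius a falling on a unit square, takes T to be the square with rounded corners at distance a outside the square (the shape used in §5), and estimates the mean uncovered fraction as the proportion of trials in which one random point of A escapes all k circles. The function names, the seed and the rejection sampling of centres are hypothetical choices.

```python
import math
import random


def distance_outside_unit_square(x, y):
    """Distance from (x, y) to the unit square [0,1]^2 (zero for points inside)."""
    dx = max(-x, 0.0, x - 1.0)
    dy = max(-y, 0.0, y - 1.0)
    return math.hypot(dx, dy)


def random_centre_in_T(a, rng):
    """Uniform point of T, the set of points within distance a of the unit square."""
    while True:
        x = rng.uniform(-a, 1.0 + a)
        y = rng.uniform(-a, 1.0 + a)
        if distance_outside_unit_square(x, y) <= a:
            return x, y


def simulated_mean_uncovered(a, k, trials, seed=0):
    """Proportion of trials in which a random point of A is missed by all k circles."""
    rng = random.Random(seed)
    uncovered = 0
    for _ in range(trials):
        px, py = rng.random(), rng.random()
        centres = [random_centre_in_T(a, rng) for _ in range(k)]
        if all(math.hypot(cx - px, cy - py) > a for cx, cy in centres):
            uncovered += 1
    return uncovered / trials


def theoretical_mean_uncovered(a, k):
    """Formula (12) with T = 1 + 4a + pi a^2 and C = pi a^2."""
    T = 1.0 + 4.0 * a + math.pi * a * a
    C = math.pi * a * a
    return ((T - C) / T) ** k


if __name__ == "__main__":
    print(simulated_mean_uncovered(0.3, 3, 20000), theoretical_mean_uncovered(0.3, 3))
```

The two printed values should agree to within sampling error, which for 20,000 trials is of the order of 0.003.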


For the 2nd moment we are concerned with two $C$'s centred at $(x_1, y_1)$ and $(x_2, y_2)$, and if
their common area is $\Omega(x_1, y_1, x_2, y_2)$, we have
$$q(x_1, y_1, x_2, y_2) = \frac{T - 2C + \Omega(x_1, y_1, x_2, y_2)}{T}. \tag{14}$$
The 2nd moment $\mu_2(Y/A)$ is then the expectation of $q^k$ or $G(q)$ for all pairs of points over $A$,
and we no longer have a simple formula as in the case of the 1st moment. The overlap $\Omega$,
however, depends on the relative positions of the two $C$'s, and therefore the number of
variables in the integration is reduced from 4 to 2 or 1. This is illustrated in the following
examples.


5. CIRCLES FALLING ON A FIXED SQUARE 
Assume $A$ to be a square of unit side (i.e. $A = 1$), $C$ a circle radius $a$ and $T$ a ‘square with
rounded corners’, whose boundary is at a distance $a$ outside the sides of $A$. Thus
$$T = 1 + 4a + \pi a^2 \tag{15}$$
and
$$C = \pi a^2. \tag{16}$$
It is seen that the fraction $q$ of $T$ outside the two circles centres $(x_1, y_1)$ and $(x_2, y_2)$ is a function
only of the distance $r$ between these points, and can therefore be written as $q(r)$. Hence
if $\phi(r)$ is the frequency function of $r$, we obtain
$$\mu_2(Y) = \int_0^{\sqrt2} q^k(r)\,\phi(r)\,dr, \tag{17}$$
or
$$\mu_2(Y) = \int_0^{\sqrt2} G\{q(r)\}\,\phi(r)\,dr. \tag{18}$$
The area $\Omega(r)$ common to two circles radii $a$ with centres distant $r$ apart is
$$\Omega(r) = 2a^2(\theta - \sin\theta\cos\theta), \tag{19}$$
where
$$r = 2a\cos\theta \quad (r < 2a), \tag{20}$$
and
$$\Omega(r) = 0 \quad (r > 2a), \tag{21}$$
and
$$q(r) = \frac{1 + 4a - \pi a^2 + \Omega(r)}{1 + 4a + \pi a^2}, \tag{22}$$
where $\Omega(r)$ is given by (19), (20) and (21).
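Formulae (19) to (22) translate directly into two short functions. This is a minimal sketch; the names `lens_area` and `q_square` are hypothetical and the code is an illustration rather than anything given in the paper.

```python
import math


def lens_area(r, a):
    """Omega(r) of (19)-(21): area common to two circles of radius a, centres r apart."""
    if r >= 2.0 * a:
        return 0.0
    theta = math.acos(r / (2.0 * a))          # from r = 2a cos(theta), formula (20)
    return 2.0 * a * a * (theta - math.sin(theta) * math.cos(theta))


def q_square(r, a):
    """q(r) of (22): fraction of T outside two circles, for the unit square of section 5."""
    T = 1.0 + 4.0 * a + math.pi * a * a       # formula (15)
    return (T - 2.0 * math.pi * a * a + lens_area(r, a)) / T


if __name__ == "__main__":
    a = 0.3
    print(lens_area(0.0, a), math.pi * a * a)  # coincident centres: overlap equals C
    print(q_square(0.0, a), q_square(1.0, a))
```

When the two centres coincide the overlap equals the whole circle, so that q(0) reduces to (T − C)/T, the value appearing in (12).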
To obtain the frequency function $\phi(r)$ of $r$, we note that
$$r^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2, \tag{23}$$
where $x_1$, $x_2$, $y_1$ and $y_2$ are uniformly and independently distributed in the range 0, 1. The
difference $\xi = |x_1 - x_2|$ follows the ‘triangular’ distribution
$$df = 2(1 - \xi)\,d\xi, \tag{24}$$
from $\xi = 0$ to 1, so that the quantity
$$u = \xi^2 = (x_1 - x_2)^2$$
follows the distribution
$$df = \frac{1 - \sqrt u}{\sqrt u}\,du. \tag{25}$$
Similarly, $v = (y_1 - y_2)^2$ follows independently the same distribution
$$df = \frac{1 - \sqrt v}{\sqrt v}\,dv. \tag{26}$$
The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product
$$\frac{(1 - \sqrt u)(1 - \sqrt v)}{\sqrt{(uv)}} \tag{27}$$
of the frequencies of $u$ and $v$ over that part of the line $u + v = r^2$ within the square of unit
side in which the point $u, v$ can lie, and we obtain without much difficulty for the frequency
function of $r$,
$$\phi(r) = 2r(\pi - 4r + r^2) \quad\text{for } 0 < r < 1, \tag{28}$$
and
$$\phi(r) = 2r\{4\sin^{-1}(1/r) + 4\sqrt{(r^2 - 1)} - r^2 - \pi - 2\} \quad\text{for } 1 < r < \sqrt2. \tag{29}$$
Thus the 2nd moment of the fraction of the unit square uncovered is given by the integral
(17) or (18), where $q(r)$ is given by (22), (19), (20) and (21) and $\phi(r)$ is given by (28) and (29).
It does not appear possible to reduce the integral (17) simply to elementary functions, and
quadrature must be used. The integrand has discontinuities in its first derivative at $r = 1$
and $r = 2a$, so that the integration must be carried out separately over the intervals with
these as end-points.
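The quadrature just described might be sketched as follows. The midpoint rule, the number of steps and all function names are assumptions made for illustration; the paper does not say which quadrature rule was used. The range is split at r = 1 and r = 2a, as the text requires.

```python
import math

SQRT2 = math.sqrt(2.0)


def lens_area(r, a):
    """Omega(r) of (19)-(21)."""
    if r >= 2.0 * a:
        return 0.0
    theta = math.acos(r / (2.0 * a))
    return 2.0 * a * a * (theta - math.sin(theta) * math.cos(theta))


def q_square(r, a):
    """q(r) of (22)."""
    T = 1.0 + 4.0 * a + math.pi * a * a
    return (T - 2.0 * math.pi * a * a + lens_area(r, a)) / T


def phi_square(r):
    """Frequency function (28)-(29) of the distance between two random points of the unit square."""
    if r <= 0.0 or r >= SQRT2:
        return 0.0
    if r <= 1.0:
        return 2.0 * r * (math.pi - 4.0 * r + r * r)
    return 2.0 * r * (4.0 * math.asin(1.0 / r) + 4.0 * math.sqrt(r * r - 1.0)
                      - r * r - math.pi - 2.0)


def midpoint(f, lo, hi, steps=2000):
    """Midpoint-rule integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def second_moment_square(a, k):
    """Integral (17), taken separately over the intervals ending at r = 1 and r = 2a."""
    cuts = sorted({0.0, 1.0, min(2.0 * a, SQRT2), SQRT2})
    return sum(midpoint(lambda r: q_square(r, a) ** k * phi_square(r), lo, hi)
               for lo, hi in zip(cuts, cuts[1:]))


if __name__ == "__main__":
    a, k = 0.3, 3
    T = 1.0 + 4.0 * a + math.pi * a * a
    mean = ((T - math.pi * a * a) / T) ** k
    print("variance:", second_moment_square(a, k) - mean ** 2)
```

A useful check on (28) and (29) is that the frequency function integrates to unity over the range 0 to √2.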


6. OVERLAP OF CIRCLES ON FIXED RECTANGLE* 


We replace the square $A$ of the previous section by a rectangle $A$; for convenience we assume
its sides to be $\sqrt b$ and $1/\sqrt b$, where $b > 1$, so that the ratio of the longer to the shorter side is
$b$ and the area is unity. The centres of the circles radii $a$ are assumed to be equally likely to
fall anywhere in a ‘rectangle with rounded corners’ $T$, whose boundary is at a distance $a$
outside $A$, i.e.
$$T = 1 + \pi a^2 + 2a(\sqrt b + 1/\sqrt b). \tag{30}$$
To obtain the 2nd moment of the fraction of the area of $A$ not covered, we calculate an
integral similar to (17) or (18). The function
$$q(r) = \frac{T - 2\pi a^2 + \Omega(r)}{T}$$
is derived from $\Omega(r)$, which remains the same, but the frequency distribution $\phi(r)$ of the distance
between a pair of points chosen at random in the rectangle is different.
The co-ordinates $x_1$ and $x_2$ are uniformly and independently distributed in the range 0,
$\sqrt b$ (if we take $Ox$ parallel to the longer side). The distribution of $u = (x_1 - x_2)^2$ is thus seen
from (25) to be
$$df = \frac{1 - \sqrt{(u/b)}}{\sqrt{(bu)}}\,du, \tag{31}$$
while the distribution of $v = (y_1 - y_2)^2$ is
$$df = \frac{1 - \sqrt{(bv)}}{\sqrt{(v/b)}}\,dv. \tag{32}$$


* This problem was solved by Robbins (1945); see footnote on p. 1. 











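The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product
$$\frac{(\sqrt b - \sqrt u)\{1 - \sqrt{(bv)}\}}{\sqrt{(buv)}}$$
over that part of the line $u + v = r^2$ within the rectangle $0 < u < b$, $0 < v < 1/b$. This gives
$$\phi(r) = \phi_1(r) = 2r[\pi - 2r\{\sqrt b + \sqrt{(1/b)}\} + r^2] \quad\text{for } r < 1/\sqrt b, \tag{33}$$
$$\phi(r) = \phi_2(r) = 2r[2\alpha - 1/b - 2r\sqrt b\,(1 - \cos\alpha)] \quad\text{for } 1/\sqrt b < r < \sqrt b, \tag{34}$$
where $\alpha = \sin^{-1}(1/r\sqrt b)$,
and
$$\phi(r) = \phi_3(r) = 2r[2(\alpha - \beta) - b - 1/b + 2r\sin\beta/\sqrt b + 2r\sqrt b\cos\alpha - r^2] \quad\text{for } \sqrt b < r < \sqrt{(b + 1/b)}, \tag{35}$$
where $\beta = \cos^{-1}(\sqrt b/r)$.

Thus the 2nd moment of $Y$ can be found from (19)-(21) together with (30) and (33)-(35).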
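The three branches (33) to (35) can be written as a single function and checked by confirming that the frequency function integrates to unity. The sketch below is illustrative only; the function names and the midpoint rule are assumptions. For b = 1 the middle branch disappears and the result should coincide with (28) and (29) for the square.

```python
import math


def phi_rectangle(r, b):
    """Frequency function (33)-(35) for a rectangle of unit area with sides sqrt(b), 1/sqrt(b)."""
    long_side = math.sqrt(b)
    short_side = 1.0 / long_side
    if r <= 0.0 or r >= math.sqrt(b + 1.0 / b):
        return 0.0
    if r <= short_side:                                   # formula (33)
        return 2.0 * r * (math.pi - 2.0 * r * (long_side + short_side) + r * r)
    alpha = math.asin(min(1.0, 1.0 / (r * long_side)))
    if r <= long_side:                                    # formula (34)
        return 2.0 * r * (2.0 * alpha - 1.0 / b
                          - 2.0 * r * long_side * (1.0 - math.cos(alpha)))
    beta = math.acos(min(1.0, long_side / r))             # formula (35)
    return 2.0 * r * (2.0 * (alpha - beta) - b - 1.0 / b
                      + 2.0 * r * math.sin(beta) / long_side
                      + 2.0 * r * long_side * math.cos(alpha) - r * r)


def midpoint(f, lo, hi, steps=4000):
    """Midpoint-rule integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def total_probability(b):
    """Integral of phi over its whole range, taken piecewise between the break points."""
    cuts = sorted({0.0, 1.0 / math.sqrt(b), math.sqrt(b), math.sqrt(b + 1.0 / b)})
    return sum(midpoint(lambda r: phi_rectangle(r, b), lo, hi)
               for lo, hi in zip(cuts, cuts[1:]))


if __name__ == "__main__":
    for b in (1.0, 2.0, 4.0):
        print(b, total_probability(b))
```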


7. OVERLAP OF RECTANGLES ON A FIXED RECTANGLE 


Assume that the fixed rectangle $A$ has sides $a$ and $b$ and the covering rectangles $C$ have sides
$\alpha$ and $\beta$. The latter are assumed to be dropped with sides $\alpha$ parallel to the side $a$ and with
their centres anywhere inside the rectangle $T$, which is concentric with $ab$ and has sides
$a + \alpha$ and $b + \beta$. To calculate the 2nd moment of the fraction $Y/A$ of $A$ not covered by $k$ $C$'s,
we use (18) and calculate the expectation of $q^k(x_1, y_1, x_2, y_2)$, where $q$ is the fraction of $T$ not covered
by two $C$'s with their centres $(x_1, y_1)$ and $(x_2, y_2)$ falling at random in $A$. The area common to
two $C$'s is readily seen to depend only on the difference $\xi$ of the $x$ co-ordinates of their
centres and on the similar difference $\eta$ of their $y$ co-ordinates. In fact, the area can be
written as
$$\Omega(x_1, y_1, x_2, y_2) = [\alpha - \xi][\beta - \eta], \tag{36}$$
where the symbol* $[x]$ stands for $x$ when $x > 0$ and is zero when $x < 0$, and we obtain
$$q(x_1, y_1, x_2, y_2) = 1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}. \tag{37}$$
To obtain the expectation of $q^k$, we need the frequency distribution of $\xi$ and $\eta$. As in §5,
$\xi$ is readily seen to follow the frequency distribution
$$df = \frac{2(a - \xi)}{a^2}\,d\xi \tag{38}$$
between 0 and $a$, with a similar distribution for $\eta$, and we obtain the result
$$\mu_2(Y/A) = \frac{4}{a^2b^2}\int_0^a\!\!\int_0^b \left(1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k (a - \xi)(b - \eta)\,d\xi\,d\eta. \tag{39}$$
If the kth power be expanded, the resulting integrals are, with the exception of the first, the
product of integrals whose upper limits are $a' = \min(a, \alpha)$ and $b' = \min(b, \beta)$ respectively.
We obtain
$$\mu_2(Y/A) = \frac{4}{a^2b^2}\left[\int_0^{a'}\!\!\int_0^{b'} \left(1 + \frac{(\alpha - \xi)(\beta - \eta) - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k (a - \xi)(b - \eta)\,d\xi\,d\eta \right.$$
$$\left. + \left(1 - \frac{2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k \left\{\int_0^a\!\!\int_0^b (a - \xi)(b - \eta)\,d\xi\,d\eta - \int_0^{a'}\!\!\int_0^{b'} (a - \xi)(b - \eta)\,d\xi\,d\eta\right\}\right]. \tag{40}$$

* The writer is indebted to Neyman & Bronowski for this convenient notation (see below).

By a simple change of variable we obtain
$$\mu_2(Y/A) = \frac{4}{a^2b^2}\int_{[\alpha - a]}^{\alpha}\!\!\int_{[\beta - b]}^{\beta} \left(1 + \frac{uv - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k (a - \alpha + u)(b - \beta + v)\,du\,dv$$
$$+ \left(1 - \frac{2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k \frac{a^2[b - \beta]^2 + b^2[a - \alpha]^2 - [a - \alpha]^2[b - \beta]^2}{a^2b^2}, \tag{41}$$
which is the result obtained by Bronowski & Neyman by a rather different method.*
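The step from (39) to (40) can be checked numerically by evaluating both forms. The sketch below is an illustration under stated assumptions: a midpoint rule on a regular grid, hypothetical function names, and arbitrarily chosen dimensions. In the second function the part of the integrand that is constant outside the region 0 < ξ < a', 0 < η < b' is integrated analytically, as in (40).

```python
def bracket(x):
    """The symbol [x] of (36): x when x > 0, zero otherwise."""
    return x if x > 0.0 else 0.0


def mu2_direct(a, b, alpha, beta, k, steps=300):
    """Formula (39) by the midpoint rule over 0 < xi < a, 0 < eta < b."""
    T = (a + alpha) * (b + beta)
    hx, hy = a / steps, b / steps
    total = 0.0
    for i in range(steps):
        xi = (i + 0.5) * hx
        for j in range(steps):
            eta = (j + 0.5) * hy
            q = 1.0 + (bracket(alpha - xi) * bracket(beta - eta) - 2.0 * alpha * beta) / T
            total += q ** k * (a - xi) * (b - eta)
    return 4.0 * total * hx * hy / (a * a * b * b)


def mu2_split(a, b, alpha, beta, k, steps=300):
    """Formula (40): quadrature only over 0 < xi < a', 0 < eta < b'."""
    T = (a + alpha) * (b + beta)
    a1, b1 = min(a, alpha), min(b, beta)
    hx, hy = a1 / steps, b1 / steps
    inner = 0.0
    for i in range(steps):
        xi = (i + 0.5) * hx
        for j in range(steps):
            eta = (j + 0.5) * hy
            q = 1.0 + ((alpha - xi) * (beta - eta) - 2.0 * alpha * beta) / T
            inner += q ** k * (a - xi) * (b - eta)
    inner *= hx * hy
    q0 = 1.0 - 2.0 * alpha * beta / T                     # value of q where the overlap vanishes
    full = (a * a / 2.0) * (b * b / 2.0)                  # integral of (a-xi)(b-eta) over A
    part = (a * a1 - a1 * a1 / 2.0) * (b * b1 - b1 * b1 / 2.0)
    return 4.0 * (inner + q0 ** k * (full - part)) / (a * a * b * b)


if __name__ == "__main__":
    print(mu2_direct(1.0, 1.0, 0.5, 0.5, 3), mu2_split(1.0, 1.0, 0.5, 0.5, 3))
```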


8. OVERLAP OF CIRCLES ON A FIXED CIRCLE 

We now consider a fixed circle $A$ of unit area and therefore of radius $b = 1/\sqrt\pi$, with $k$ circles
$C$ of radius $a$ dropped at random with their centres uniformly distributed over a circle $T$
of radius $a + b$. The 2nd moment of the fraction $Y/A$ of $A$ not covered is, as in §5, the
expectation of $q^k(r)$, where $q(r)$ is the fraction of $T$ not covered by two circles with centres
falling at random in $A$ a distance $r$ apart. We have
$$q(r) = \frac{1 + 2a\sqrt\pi - \pi a^2 + \Omega(r)}{1 + 2a\sqrt\pi + \pi a^2}, \tag{42}$$
where $\Omega(r)$, the overlap of two circles radius $a$ with centres $r$ apart, is given by (19), (20)
and (21) as before. We thus need the frequency distribution $\phi(r)$ of the distance between two
points chosen at random in the circle of unit area to obtain
$$\mu_2(Y/A) = \int_0^{2/\sqrt\pi} q^k(r)\,\phi(r)\,dr. \tag{43}$$
To do this we use a fairly straightforward geometrical method, finding first the probability
integral
$$F(r) = \int_0^r \phi(r)\,dr, \tag{44}$$
which is the probability that the distance between the two random points is less than $r$.

The probability that the first point is between $v$ and $v + dv$ from the centre is $2v\,dv/b^2$,
while if
$$A(v) = \text{area common to circles radii } b \text{ and } r \text{ with centres distance } v \text{ apart}, \tag{45}$$
it follows that the probability of the second point being within $r$ of the first is $A(v)/\pi b^2$. Hence
$$F(r) = \int_0^b \frac{A(v)}{\pi b^2}\,\frac{2v\,dv}{b^2}. \tag{46}$$
Construct the triangle with sides $r$, $b$ and $v$, and let the angles opposite to these be $\theta$, $\phi$
and $\psi$. Then the following can be readily verified:

If $r < b$,
$$A(v) = b^2\theta + r^2\phi - br\sin\psi \quad\text{if } b - r < v < b,$$
$$A(v) = \pi r^2 \quad\text{if } 0 < v < b - r. \tag{47}$$
If $b < r < 2b$,
$$A(v) = b^2\theta + r^2\phi - br\sin\psi \quad\text{if } r - b < v < b,$$
$$A(v) = \pi b^2 \quad\text{if } 0 < v < r - b. \tag{48}$$


The integration in (46) is carried out by parts, with $\psi$ as the ultimate variable of integration,
and to do this we obtain the result
$$\frac{dA}{dv} = -\frac{2br\sin\psi}{v}. \tag{49}$$
Putting
$$r = 2b\sin\tfrac12\alpha, \tag{50}$$
we obtain, over the whole range of $r$,
$$F(r) = \frac{\alpha}{\pi} + r^2(\pi - \alpha) - \frac{2 - \cos\alpha}{\pi}\sin\alpha, \tag{51}$$
* And by Robbins (1945).










while differentiation yields the frequency distribution as
$$\phi(r) = 2r(\pi - \alpha) - 2r\sin\alpha. \tag{52}$$
It will be noted as a matter of interest that the chance of the two random points falling
further apart than the radius of the circle is $1 - F(b) = 3\sqrt3/4\pi$, or 9/22 nearly.*
We thus obtain the 2nd moment of the uncovered area from (42), (43) and (52).
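The probability integral and frequency function for the circle of unit area can be checked against each other numerically. In the sketch below the frequency function is written in the form 2r(π − α − sin α), with r = 2b sin ½α as in (50); the function names and the midpoint rule are assumptions made for illustration only.

```python
import math

B = 1.0 / math.sqrt(math.pi)      # radius of the circle of unit area


def alpha_of(r):
    """The angle alpha defined by r = 2b sin(alpha/2), formula (50)."""
    return 2.0 * math.asin(min(1.0, r / (2.0 * B)))


def F_circle(r):
    """Probability that two random points of the unit-area circle are less than r apart."""
    al = alpha_of(r)
    return al / math.pi + r * r * (math.pi - al) - (2.0 - math.cos(al)) * math.sin(al) / math.pi


def phi_circle(r):
    """Frequency function of the distance r, i.e. the derivative of F_circle."""
    al = alpha_of(r)
    return 2.0 * r * (math.pi - al - math.sin(al))


def midpoint(f, lo, hi, steps=4000):
    """Midpoint-rule integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    print(midpoint(phi_circle, 0.0, 2.0 * B))           # total probability
    print(1.0 - F_circle(B), 3.0 * math.sqrt(3.0) / (4.0 * math.pi), 9.0 / 22.0)
```

The second line printed compares 1 − F(b) with 3√3/4π and with the approximation 9/22 quoted in the text.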


9. USE OF PROBABILITY GENERATING FUNCTIONS 
(i) Binomial 
It is interesting to apply first the binomial distribution of $k$, the number of $C$'s dropped
uniformly and at random on $T$. Assume that $T$ contains the centres of all the $C$'s which
touch or cover $A$, and that $S$ is some larger area including $T$. If $l$ $C$'s are dropped at random
on $S$, the probability generating function of the number of centres falling on $T$ is
$$G_1(u) = \left(\frac{S - T + Tu}{S}\right)^l. \tag{53}$$
Thus, from the general result of §4, the mth moment of the fraction of $A$ uncovered is the
expectation of $\left(\dfrac{S - T + Tq}{S}\right)^l$, where $q$ is the fraction of $T$ uncovered by $m$ $C$'s falling on $A$.
But the expression within brackets is the fraction of $S$ uncovered. The use of the binomial
generating function is thus verified.
(ii) Poisson 
The Poisson distribution next suggests itself. If the number of centres follows this distribution
with a mean of $\lambda$ per unit area, the probability generating function is
$$G_2(u) = e^{\lambda T(u - 1)}, \tag{54}$$
and the mth moment will be the expectation over $A$ of
$$e^{\lambda T(q - 1)}.$$
Alternatively, we could write this as
$$\mu_m(Y/A) = \operatorname{exp}(e^{-\lambda Z}), \tag{55}$$
where exp denotes expectation and $Z$ = area of overlap of $m$ $C$'s falling on $A$. In particular,
$$\text{mean value of } Y/A = \mu_1(Y/A) = e^{-\lambda C}. \tag{56}$$
Thus the mth moment of $Y/A$ is related to the characteristic function of $Z$, but this result
does not appear to be of any theoretical importance: it does not, for instance, throw any light
on the frequency distribution of $Y/A$. Formula (55) does, however, demonstrate the fact,
which is otherwise obvious, that the area $T$ does not enter into the frequency distribution of
the fraction of $A$ not covered by $C$'s whose fall follows the Poisson distribution.

As far as the calculation of the variance is concerned, we need to calculate first the 2nd
moment of $Y/A$, i.e. the expectation of $e^{-\lambda Z}$ for two $C$'s. In the cases where the falling areas are circles, the area of overlap $Z$ of two
circles is equal to $2C - \Omega(r)$, where $\Omega(r)$ is the function given above ((19) etc.) for the area
common to two $C$'s with centres $r$ apart. The 2nd moment is thus
$$e^{-2\lambda C}\int e^{\lambda\Omega(r)}\,\phi(r)\,dr, \tag{57}$$
where $\phi(r)$ is the frequency function of $r$, the formulae for which are given above for the
various cases.
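Formula (57) may be evaluated by quadrature along the following lines for circles falling on the unit square. This is a sketch only: the midpoint rule, the step count and the function names are assumptions, and the density λ is obtained from the required mean m through m = e^{−λC}, formula (56). For C = 0.2 and m = 0.5 the result can be compared with the entry 0.0303 for case (i) in Table 1.

```python
import math

SQRT2 = math.sqrt(2.0)


def lens_area(r, a):
    """Omega(r) of (19)-(21)."""
    if r >= 2.0 * a:
        return 0.0
    theta = math.acos(r / (2.0 * a))
    return 2.0 * a * a * (theta - math.sin(theta) * math.cos(theta))


def phi_square(r):
    """Frequency function (28)-(29) for the unit square."""
    if r <= 0.0 or r >= SQRT2:
        return 0.0
    if r <= 1.0:
        return 2.0 * r * (math.pi - 4.0 * r + r * r)
    return 2.0 * r * (4.0 * math.asin(1.0 / r) + 4.0 * math.sqrt(r * r - 1.0)
                      - r * r - math.pi - 2.0)


def midpoint(f, lo, hi, steps=2000):
    """Midpoint-rule integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


def poisson_variance_circles_on_square(C, m):
    """Variance of the uncovered fraction: formula (57) less the square of the mean."""
    a = math.sqrt(C / math.pi)                 # radius of a circle of area C
    lam = -math.log(m) / C                     # from m = exp(-lambda C), formula (56)
    cuts = sorted({0.0, 1.0, min(2.0 * a, SQRT2), SQRT2})
    expectation = sum(
        midpoint(lambda r: math.exp(lam * lens_area(r, a)) * phi_square(r), lo, hi)
        for lo, hi in zip(cuts, cuts[1:]))
    return math.exp(-2.0 * lam * C) * expectation - m * m


if __name__ == "__main__":
    print(poisson_variance_circles_on_square(0.2, 0.5))
```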


* The solution to this problem (no. 698), given by Whitworth (1897), contains an error, resulting in the incorrect 
value of 35/88 nearly. 










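In the case of rectangles falling on rectangles, the necessary formula for the variance is
given by Neyman & Bronowski in the form of a series, viz.
$$\mu_2(Y/A) = \frac{4e^{-2\lambda\alpha\beta}}{a^2b^2}\sum_{s=1}^{\infty}\frac{(\lambda\alpha\beta)^s\,\alpha\beta}{s!\,(s+1)^2(s+2)^2}$$
$$\times\,\{(s+2)a - \alpha + [\alpha - a](1 - a/\alpha)^{s+1}\}\,\{(s+2)b - \beta + [\beta - b](1 - b/\beta)^{s+1}\}, \tag{58}$$
where $\mu_2$ here denotes the 2nd moment about the mean.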
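The series for the variance can be summed directly. The sketch below assumes the summation runs from s = 1 with the factor 4e^{−2λαβ}/a²b² in front, which is the form that follows from expanding e^{λΩ} in (57) and subtracting the square of the mean; the function names and the number of terms retained are illustrative choices. For squares of area 0.2 and 1.0 falling on a unit square with mean uncovered fraction 0.5, the results can be compared with the entries 0.0300 and 0.0945 for case (v) in Table 1.

```python
import math


def bracket(x):
    """The symbol [x]: x when x > 0, zero otherwise."""
    return x if x > 0.0 else 0.0


def poisson_variance_rectangles(a, b, alpha, beta, lam, terms=60):
    """Series for the variance of the uncovered fraction, rectangles alpha x beta on a x b."""
    x = lam * alpha * beta
    total = 0.0
    for s in range(1, terms + 1):
        factor_a = (s + 2) * a - alpha + bracket(alpha - a) * (1.0 - a / alpha) ** (s + 1)
        factor_b = (s + 2) * b - beta + bracket(beta - b) * (1.0 - b / beta) ** (s + 1)
        total += (x ** s * alpha * beta
                  / (math.factorial(s) * (s + 1) ** 2 * (s + 2) ** 2)
                  * factor_a * factor_b)
    return 4.0 * math.exp(-2.0 * x) * total / (a * a * b * b)


def squares_on_unit_square(C, m):
    """Squares of area C on the unit square, density chosen so that the mean is m."""
    side = math.sqrt(C)
    lam = -math.log(m) / C                    # from m = exp(-lambda C)
    return poisson_variance_rectangles(1.0, 1.0, side, side, lam)


if __name__ == "__main__":
    print(squares_on_unit_square(0.2, 0.5), squares_on_unit_square(1.0, 0.5))
```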





Table 1. Variance of fraction of fixed area not covered by areas C
falling according to Poisson distribution
(i) Circles falling on square; (ii) circles falling on rectangle 2 x 1; (iii) circles falling on rectangle 4 x 1;
(iv) circles falling on circle; (v) squares falling on square (sides parallel).

Size of falling                    Mean area not covered
area C ÷ size          Case      0.25       0.50       0.75
of fixed area                    Variance of area not covered

0.2                    (i)       0.0186     0.0303     0.0254
                       (ii)      0.0182     0.0296     0.0248
                       (iii)     0.0169     0.0275     0.0229
                       (iv)      0.0190     0.0310     0.0260
                       (v)       0.0182     0.0300     0.0252

1.0                    (i)       0.0608     0.0964     0.0789
                       (ii)      0.0573     0.0904     0.0743
                       (iii)     0.0489     0.0766     0.0627
                       (iv)      0.0620     0.0983     0.0810
                       (v)       0.0593     0.0945     0.0781

1.8                    (i)       0.0816     0.1262     0.1026
                       (ii)      0.0771     0.1193     0.1040
                       (iii)     0.0657     0.1015     0.0824
                       (iv)      0.0829     0.1281     0.1040
                       (v)       0.0790     0.1230     0.1002








In general, the variance increases in the following order as between the different combinations of shapes: 
(1) Circles on rectangle 4 x 1, (iii). 
(2) Circles on rectangle 2 x 1, (ii). 
(3) Squares on square, (v). 
(4) Circles on square, (i). 
(5) Circles on circle, (iv). 

There is an exception in the last case considered, however (mean fraction of area not covered = 0.75, size of
falling area ÷ fixed area = 1.8), when the order is changed somewhat. However, the variances are generally of the
same approximate magnitude.

(iii) Contagious distribution 

Neyman & Bronowski have included in their study the case of a contagious law of type A 

with two parameters (see Neyman, 1939). Here the probability generating function is 
Gy(u) = emtePo--2), 


They have pointed out that the expression of this as a series enables calculations to be utilized 
from the Poisson distribution. 


In the general case the sth moment of the fraction of A not covered is the expectation of


eme 2-1) (60) 


where Z is the area common to s C's falling with their centres on A.


(59) 










10. NUMERICAL RESULTS 


It is impossible to calculate complete tables covering all cases, but it is of interest to calculate
a few values for the purpose of comparison, and the following combinations of areas have been
considered:

        Fixed area           Falling areas
(i)     Square               Circles
(ii)    Rectangle 2 x 1      Circles
(iii)   Rectangle 4 x 1      Circles
(iv)    Circle               Circles
(v)     Square               Squares (with sides parallel to fixed square)

In each case the fixed area has been made of unit size, while the falling areas were respectively
$C$ = 0.2, 1.0 and 1.8. The areas were assumed to fall according to the Poisson distribution,
the number of centres per unit area being such that the expected fraction of the fixed area
$A$ not covered was respectively 0.25, 0.5 and 0.75. Since from equation (56) the expected
fraction of $A$ not covered is $m = e^{-\lambda C}$, the relations between $\lambda$ and $C$ for the 9 combinations
of $m$ and $C$ are $\lambda C = \log_e 4$, $\log_e 2$ and $\log_e 4/3$.
The 2nd moments were determined by quadrature from formula (57), where the falling areas
were circles, and by direct evaluation of the series (58) for the case of squares on squares.
The results are given in Table 1.


11. EXPERIMENTAL INVESTIGATION 
Before the work of Robbins and Neyman & Bronowski was brought to the notice of the 
writer, an attempt was made to obtain experimentally a general formula for the variance of 
the fraction of area not covered. Attention was confined to the case of circles falling on 
squares, the centres of the former being chosen randomly (by means of random numbers) 
within the area 7’ whose boundary is at a distance a outside the sides of the unit square A. 


For each combination of C and k, samples of up to 200 in size were drawn. The various 
combinations were as given in Table 2. 


Table 2. Ranges of k and C covered in experimental determination of variance

Area of circles dropped ÷ area of square = C      No. of circles = k

0.0077                                             5, 10, 15
0.031                                              5, 10, 15, 20, 30
0.033                                              5, 20, 40, 80, 120
0.25                                               1, 2, 4, 6
1.0                                                1, 2, 4, 6





Three methods were used to measure the fraction of the fixed square not covered in each
sample. Method P involved the measurement of the covered area by planimeter. Method
L utilized a photoelectric cell to measure the amount of light passing through a glass plate
on which black paper disks had been stuck. Method C consisted of a simple count of squares
on graduated paper, and generally this was the most convenient to operate. (The two neighbouring
values of C were used to compare methods L and P.)

For each combination of C and k the average fraction not covered, $\bar Y$, was compared with
the theoretical value
$$m = (1 - C/T)^k. \tag{61}$$











The observed standard deviation of the observations being $s$, the appropriate criterion for
testing the mean is
$$t = \frac{\bar Y - m}{s/\sqrt P},$$
$P$ being the number of observations in the sample. The results are given in Table 3.
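The criterion is easily computed. The sketch below is illustrative only: the function names are hypothetical, and the theoretical mean is taken from (61) with T = 1 + 4a + πa² for circles of area C = πa² on the unit square, as in §5.

```python
import math


def theoretical_mean(C, k):
    """m = (1 - C/T)^k, formula (61), for circles of area C on the unit square."""
    a = math.sqrt(C / math.pi)                # radius of a circle of area C
    T = 1.0 + 4.0 * a + math.pi * a * a       # formula (15)
    return (1.0 - C / T) ** k


def t_criterion(observed_mean, m, s, P):
    """t = (observed mean - m) / (s / sqrt(P)) for a sample of P observations."""
    return (observed_mean - m) / (s / math.sqrt(P))


if __name__ == "__main__":
    # C = 0.25, k = 1: compare with the expected value 0.8950 in Table 3.
    print(theoretical_mean(0.25, 1))
    # First row of Table 3: observed 0.9694, expected 0.9683, s = 0.0055, P = 100.
    print(t_criterion(0.9694, 0.9683, 0.0055, 100))
```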


Table 3. Comparison between observed and expected values of fraction of area not covered
P = planimeter method. L = photoelectric method. C = counting method.

Area of circle    No. of     No. in     Mean area not covered     Standard      t =
÷ area of         circles    sample     Observed    Expected      deviation     Deviation      Method
square = C        k          P          Ȳ           m             s             ÷ (s/√P)

0.0077            5          100        0.9694      0.9683        0.0055         2.0000*       P
0.0077            10         100        0.9394      0.9376        0.0088         2.0454*       P
0.0077            15         100        0.9105      0.9079        0.0090         2.8889†       P
0.031             5          100        0.8924      0.8961        0.0222        -1.6667        P
0.031             10         200        0.8007      0.8030        0.0334        -0.9739        P
0.031             15         100        0.7134      0.7196        0.0404        -1.5347        P
0.031             20         100        0.6349      0.6449        0.0423        -2.3641*       P
0.031             30         100        0.5057      0.5179        0.0493        -2.4746*
0.033             5          100        0.8853      0.8901        0.0250        -5.9200†       L
0.033             20         100        0.6326      0.6278        0.0500         0.9600        L
0.033             40         100        0.3969      0.3941        0.0367         0.7629        L
0.033             80         75         0.1617      0.1553        0.0358         1.4756        L and P
0.033             120        50         0.0657      0.0612        0.0255         1.2478
0.25              1          30         0.8916      0.8950        0.0824        -0.2260        C
0.25              2          110        0.7853      0.8009        0.1214        -1.3477        C
0.25              4          110        0.6263      0.6415        0.1481        -1.0764        C
0.25              6          110        0.4891      0.5138        0.1307        -1.9820*       C
1.0               1          20         0.7583      0.7653        0.2350        -0.1332        C
1.0               2          50         0.5890      0.5856        0.2346         0.1025        C
1.0               4          50         0.3861      0.3430        0.2334         1.3057        C
1.0               6          50         0.1686      0.2008        0.1630        -1.3968        C

* Between 5 and 1% levels. † Beyond 1% level.

An examination of the values of $t$ shows that too many of them are outside the 5%
significance levels, while in each set corresponding to one value of C the values are too
frequently of the same sign. The worst deviation is for C = 0.033 and k = 5, with t = -5.92,
but this only corresponds to a difference between the observed mean of 0.885 and the
theoretical mean of 0.890. The test is thus very sensitive and the deviations are not serious,
and they arise from imperfections in the technique which have not been investigated in detail.

As regards random errors of measurement as distinct from bias, it was not possible to
carry out a systematic estimation of the contribution of this source to the total variation.
A series of repeated measurements for the case C = 0.031, k = 30, for which the observed
mean was 0.51, showed that the individual measurements had a standard error of about
0.009. As the total standard deviation in this case was 0.049, the true estimate of the standard
error (i.e. omitting the error of measurement) was $\sqrt{(0.049^2 - 0.009^2)} = 0.048$, and for our
purposes this difference is negligible. There is thus some evidence for assuming that this
method of estimating the variance was satisfactory.


12. DERIVATION OF EMPIRICAL FORMULA FOR THE VARIANCE 
The consideration that the theoretical variance $\sigma^2$ of the fraction uncovered must be
small whenever the mean $m$ of this fraction is near the limits of its range, zero or unity,
suggests that we might try the relation
$$\sigma^2 \sim m(1 - m)$$























































Table 4. Derivation of empirical formula

s² = observed variance; g = s²/m(1-m); σ_e² = empirical value; σ² = exact value. The last two
columns give the percentage errors of σ_e² and of s² relative to σ². The mean of g and
(T/C)^{3/2} × mean g are entered once for each value of C.

C        k     T/C     s²            g             Mean g        (T/C)^{3/2} g   σ_e²         σ²        % error σ_e²   % error s²
0.0077     5   155.7   0.0000303     0.000973      0.00108       2.10            0.0000363    -         -              -
0.0077    10   155.7   0.0000774     0.001296                                    0.0000692    -         -              -
0.0077    15   155.7   0.0000810     0.000986                                    0.0000990    -         -              -
0.031      5    46.1   0.000493      [illegible]   [illegible]   2.38            0.000684     -         -              -
0.031     10    46.1   0.00119       0.00704                                     0.00116      -         -              -
0.031     15    46.1   0.00163       0.00810                                     0.00148      -         -              -
0.031     20    46.1   0.00179       [illegible]                                 0.00168      -         -              -
0.031     30    46.1   [illegible]   [illegible]                                 [illegible]  -         -              -
0.033      5    43.5   0.000624      [illegible]   [illegible]   2.51            0.000784     -         -              -
0.033     20    43.5   [illegible]   0.01071                                     0.00188      -         -              -
0.033     40    43.5   0.00135       0.00564                                     0.00192      -         -              -
0.033     80    43.5   0.00128       0.00976                                     0.00105      -         -              -
0.033    120    43.5   0.000650      0.0113                                      0.000461     -         -              -
0.25       1     9.5   0.00679       0.0723        0.0820        [illegible]     0.00736      0.00731   0.7            [illegible]
0.25       2     9.5   0.0147        0.0924                                      0.01249      0.0123    [illegible]    19.5
0.25       4     9.5   0.0219        0.0954                                      0.0180       0.0176    [illegible]    [illegible]
0.25       6     9.5   0.0171        0.0681                                      0.0196       0.0185    5.9            [illegible]
1.0        1     4.3   0.0552        0.307         0.234         2.09            0.0471       0.0508    -7.3           8.7
1.0        2     4.3   0.0550        [illegible]                                 0.0636       0.0651    -2.3           -15.5
1.0        4     4.3   0.0545        0.241                                       0.0590       0.0540    9.3            0.9
1.0        6     4.3   0.0266        0.166                                       0.0420       0.0339    23.9           -21.5





Fig. 2. Derivation of empirical formula for variance. [Plot, on logarithmic scales, of s²/m(1-m) (ordinate) against T/C (abscissa), showing the individual observations, their means, and the straight line g(C) = 2.3 (C/T)^{3/2} representing the empirical formula σ_e² = 2.3 m(1-m)/(T/C)^{3/2}.]



for given C. Accordingly we have calculated the quantity

$$g = \frac{s^2}{m(1-m)}, \tag{62}$$

and the results are given in Table 4.

For each value of C the values of g are by no means constant, but the variation is not
excessive, and it is considered that for practical purposes we can take g to be a function
of C, at least over the range considered. To obtain a suitable form for this function, it was
decided, for very general reasons, to seek a simple relation between g(C) and T/C, the latter
being roughly the number of C’s which could be placed on T if they could be fitted together
without overlapping.


Table 5. Comparison of empirical formula for variance with exact value

σ² = exact value, σ_e² = empirical value, E = percentage error = 100 (σ_e² - σ²)/σ².

πa²           k=1       k=2       k=3       k=4       k=5       k=6

0.2   σ²      0.00501   0.00873   0.0114    0.0132    0.0144    0.0150
      σ_e²    0.00516   0.00896   0.0117    0.0135    0.0147    0.0154
      E       +3.0      +2.6      +2.6      +2.3      +2.1      +2.7

0.4   σ²      0.0156    0.0246    0.0290    0.0305    0.0300    0.0274
      σ_e²    0.0149    0.0237    0.0284    0.0304    0.0305    0.0294
      E       -4.5      -3.7      -2.1      -0.3      +1.7      +7.3

0.6   σ²      0.0277    0.0403    0.0447    0.0427    0.0395    0.0339
      σ_e²    0.0257    0.0384    0.0431    0.0433    0.0407    0.0370
      E       -7.2      -4.7      -3.6      +1.4      +3.0      +9.1

0.8   σ²      0.0398    0.0541    0.0552    0.0501    0.0428    0.0351
      σ_e²    0.0365    0.0517    0.0551    0.0525    0.0471    0.0407
      E       -8.3      -4.4      -0.2      +4.8      +10.0     +16.0

1.0   σ²      0.0508    0.0651    0.0628    0.0540    0.0436    0.0339
      σ_e²    0.0471    0.0636    0.0648    0.0590    0.0507    0.0420
      E       -7.3      -2.3      +3.2      +9.3      +16.3     +23.9

1.2   σ²      0.0605    0.0737    0.0677    0.0554    0.0427    0.0317
      σ_e²    0.0571    0.0740    0.0724    0.0635    0.0525    0.0419
      E       -5.6      +0.4      +6.9      +14.6     +22.9     +32.5

1.4   σ²      0.0689    0.0804    0.0706    0.0554    0.0410    0.0292
      σ_e²    0.0667    0.0832    0.0786    0.0665    0.0531    0.0411
      E       -3.2      +3.5      +11.3     +20.0     +29.5     +40.7

1.6   σ²      0.0763    0.0855    0.0723    0.0546    0.0389    0.0267
      σ_e²    0.0757    0.0913    0.0834    0.0684    0.0530    0.0398
      E       -0.8      +6.8      +15.4     +25.3     +36.3     +49.1

1.8   σ²      0.0828    0.0895    0.0730    0.0531    0.0367    0.0244
      σ_e²    0.0843    0.0985    0.0873    0.0695    0.0524    0.0383
      E       +1.8      +10.1     +19.6     +30.9     +42.8     +57.0




















Fig. 2 shows the result of plotting the observed mean value of g(C) against T/C on
logarithmic scales. The points (the values of s²/m(1-m) for various values of k are plotted in
addition to the mean) lie reasonably close to a straight line of slope -1.5, indicating the
relation

$$g(C) \propto (C/T)^{3/2}. \tag{63}$$

The values of (T/C)^{3/2} g(C) are shown in Table 4, where the values are seen to lie between 2
and 2.5 with an average of 2.3. Thus we derive the rough empirical formula

$$\sigma_e^2 = \frac{2.3\,m(1-m)}{(T/C)^{3/2}}, \tag{64}$$

where m = (1 - C/T)^k.
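The empirical formula can be tried out numerically with the following sketch. It rests on one assumption not spelled out at this point in the text: the fixed square is taken to have unit area and T is taken as the area within which a circle centre must fall for the circle to cover or touch the square, i.e. T = 1 + 4a + πa² for circles of radius a. The function names are illustrative only.

```python
import math


def region_area(C: float) -> float:
    """Area T holding the centres of all circles that cover or touch a unit square.

    Assumption: unit square and circles of area C (radius a), so that
    T = 1 + 4a + pi * a**2.
    """
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + math.pi * a * a


def mean_uncovered(C: float, T: float, k: int) -> float:
    """Mean fraction of the square not covered by k circles: m = (1 - C/T)**k."""
    return (1.0 - C / T) ** k


def empirical_variance(C: float, T: float, k: int, constant: float = 2.3) -> float:
    """Empirical variance 2.3 m(1 - m) / (T/C)**1.5 of the fraction not covered."""
    m = mean_uncovered(C, T, k)
    return constant * m * (1.0 - m) / (T / C) ** 1.5


def percentage_error(empirical: float, exact: float) -> float:
    """E = 100 (empirical - exact) / exact, as defined for Table 5."""
    return 100.0 * (empirical - exact) / exact


T = region_area(1.0)
sigma_e2 = empirical_variance(1.0, T, 6)
print(round(T / 1.0, 2), round(sigma_e2, 4), round(percentage_error(sigma_e2, 0.0339), 1))
```

For C = 1.0 and k = 6 the sketch gives a value close to the empirical entry 0.0420 of Table 5; the exact value 0.0339 used for the percentage error is copied from that table, not computed.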










These are given in Table 4, together with the values in some cases of the percentage error of
σ_e² compared with the true value σ² obtained from the methods described in § 5; the
percentage error in the estimate s² observed experimentally is also given. (It was not possible to
evaluate σ² in all cases, as the computation is somewhat laborious.) Another set of
comparisons between the empirical and the exact formula is given in Table 5, over the
range C = πa² from 0.2 to 1.8 and k = 1, 2, 3, 4, 5, 6.

It will be seen that the empirical formula gives quite a satisfactory fit, e.g. with an error 
less than 10 %, over a considerable part of the range studied, but that the error tends to 
increase, i.e. the formula exaggerates the variance, as C and k increase. 


13. USE OF THE EMPIRICAL FORMULA FOR THE POISSON CASE 
If k circles are dropped with their centres falling at random on the area T, the mean area not
covered can be written as

$$m(k) = (1 - C/T)^k,$$

and the empirical formula for the variance as

$$\sigma_e^2(k) = \frac{2.3\,m(k)\,[1 - m(k)]}{(T/C)^{3/2}}.$$





Table 6. Comparison of empirical formula for variance of area not covered
in case of circles falling on square according to Poisson distribution

Size of falling                 Mean area not covered, m
area C ÷ fixed area             0.25      0.5       0.75

0.2            σ²               0.0186    0.0303    0.0254
               σ_e²             0.0196    0.0308    0.0257
               % error          5.4       1.7       1.2

1.0            σ²               0.0608    0.0964    0.0789
               σ_e²             0.0669    0.0981    0.0781
               % error          10.0      1.8       1.0

1.8            σ²               0.0816    0.1262    0.1026
               σ_e²             0.0942    0.1348    0.1057
               % error          15.4      6.8       3.0























Hence if k follows the Poisson distribution with expectation λT, the total variance based
on the empirical formula is

$$\sigma_e^2 = \sum_{k=0}^{\infty} \frac{e^{-\lambda T}(\lambda T)^k}{k!}\left[\sigma_e^2(k) + m^2(k)\right] - m^2,$$

where m = expected area not covered = e^{-λC}.

We find after expanding m²(k) that

$$\sigma_e^2 = m^{2-C/T}\left[1 - \frac{2.3}{(T/C)^{3/2}}\right] - m^2 + \frac{2.3\,m}{(T/C)^{3/2}}. \tag{65}$$

This formula is compared with the exact values in Table 6 over the same range as in Table 1.
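The passage from the Poisson sum to the closed form can be checked numerically with the sketch below. It is an illustration under assumptions, not the author's computation: the fixed square is taken to have unit area, T is taken as 1 + 4a + πa² for circles of radius a, and λ is recovered from the tabulated mean through m = e^{-λC}. All function names are hypothetical.

```python
import math


def region_area(C: float) -> float:
    """Assumed area T = 1 + 4a + pi*a**2 for a unit square and circles of area C."""
    a = math.sqrt(C / math.pi)
    return 1.0 + 4.0 * a + math.pi * a * a


def poisson_variance_by_sum(C: float, T: float, m: float,
                            constant: float = 2.3, kmax: int = 400) -> float:
    """Sum the Poisson-weighted terms [sigma_e^2(k) + m(k)^2] and subtract m^2."""
    lam_T = -math.log(m) * T / C          # lambda * T, since m = exp(-lambda * C)
    q = 1.0 - C / T
    scale = constant / (T / C) ** 1.5
    total = 0.0
    log_p = -lam_T                        # log of the Poisson probability of k = 0
    for k in range(kmax + 1):
        if k > 0:
            log_p += math.log(lam_T) - math.log(k)
        m_k = q ** k
        total += math.exp(log_p) * (scale * m_k * (1.0 - m_k) + m_k * m_k)
    return total - m * m


def poisson_variance_closed_form(C: float, T: float, m: float,
                                 constant: float = 2.3) -> float:
    """Closed form: m**(2 - C/T) * (1 - c) - m**2 + c * m, with c = 2.3/(T/C)**1.5."""
    scale = constant / (T / C) ** 1.5
    return m ** (2.0 - C / T) * (1.0 - scale) - m * m + scale * m


for C, m in [(0.2, 0.25), (1.0, 0.5), (1.8, 0.25)]:
    T = region_area(C)
    print(C, m, round(poisson_variance_by_sum(C, T, m), 4),
          round(poisson_variance_closed_form(C, T, m), 4))
```

Under these assumptions the two functions agree to rounding error, and the three cases shown come out close to the empirical entries 0.0196, 0.0981 and 0.0942 of Table 6.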










The agreement is again reasonably satisfactory over the greater part of the range, large
positive errors occurring for large values of C and small values of m. These errors might be
reduced by using a constant rather smaller than 2.3 in the empirical formula, but the point
has not been investigated further.


SUMMARY 


The mathematical study of bombing has given rise to the following problem. A fixed 
outline, such as a square or circle, is drawn on a plane, and other similar outlines are dropped 
at random on it. Estimates are then required of the variance of the fixed area which is not 
covered. Work by Robbins enables a theoretical formula to be derived, and Bronowski 
& Neyman have treated, by an independent method, the special case of rectangles falling 
on rectangles. 

It is shown that in the case of circles falling on circles, squares or rectangles, the variance 
can be expressed as the integral with respect to r of the product of two functions, one being 
a simple function of the area of overlap of two circles with centres r apart, and the other 
being the frequency function of the distance r between two points chosen at random in the 
‘covered’ area. This applies both to the case where the number of falling areas is fixed and 
where it follows a Poisson distribution. Numerical values have been calculated for a 
number of cases. An experimental method had been carried out prior to the above theo- 
retical work, and the following empirical formula was derived for the variance of the 
fraction of a fixed square not covered by k circles, area C, falling at random on an area T
containing centres of all C’s which cover or touch the fixed square:

$$\sigma_e^2 = \frac{2.3\,m(1-m)}{(T/C)^{3/2}},$$

where m = mean fraction of area not covered = (1 - C/T)^k.


This formula, and its extension to the Poisson case, have been shown to be in reasonable 
agreement with the exact values over a considerable range. 


The writer is indebted to Miss G. O. Jeffcoate for valuable assistance in the computing 
and experimental work. 


REFERENCES 


Bronowski, J. & Neyman, J. (1945). Ann. Math. Statist. 16, 330.
Neyman, J. (1939). Ann. Math. Statist. 10, 35.
Robbins, H. E. (1944). Ann. Math. Statist. 15, 70.
Robbins, H. E. (1945). Ann. Math. Statist. 16, 342.
Whitworth, W. A. (1897). DCC Exercises in Choice and Chance. Cambridge.




A STUDY OF A FIRST DYNASTY SERIES OF EGYPTIAN SKULLS 
FROM SAKKARA AND OF AN ELEVENTH DYNASTY SERIES 
FROM THEBES 


By A. BATRAWI, Ph.D. and G. M. MORANT, D.Sc.


1. Introduction. This paper deals with forty-four male crania of 1st dynasty date
(c. 3400 B.c.) discovered at Sakkara by Macramallah Effendi, who has published a report 
on the excavations (1940), and with fifty-five crania of 11th dynasty (c. 2000 B.c.) soldiers 
unearthed at Thebes in 1927 by the Egyptian Expedition of the Metropolitan Museum of 
Art, New York (Winlock, 1928). The cemetery at Sakkara, 20 miles south of Gizeh, was used 


by the middle classes of the local community. Prof. D. E. Derry has kindly provided the 
following notes on it: 


The 1st dynasty cemetery at Sakkara excavated by Macramallah Effendi is of special interest.
Comparatively few cemeteries of this date have been found, and, while the total number of forty-four
skulls from which reliable measurements could be taken was small, yet the results yielded by these
are such as to show that we are dealing with a race which differs in important features from those
exhibited by the so-called predynastic people.

The observation that there were two races in Egypt in the early dynastic period was first made in 
the year 1909, when the results of measurements obtained from a series of male and female skulls of 
the 4th and 5th dynasties from the great necropolis surrounding the pyramids of Giza came to be 
examined and compared with crania from early predynastic graves. Until then the theory of an 
unbroken evolution of the Egyptian race from prehistoric times right through the dynastic period had 
been taught. It now became obvious that the culture which we know of as peculiarly Egyptian was 
associated with a race which could not have been derived from the predynastic people. The introduction 
of stone-working resulting in the erection of great tombs and statuary, as well as beautifully executed 
reliefs, paintings and above all writing, all pointed to a race far in advance of the predynastic people, 
who although skilled in the making of bowls and vases in stone as well as in pottery, and who had 
already attained to the discovery of the uses of copper, were, nevertheless, little removed from the 
Neolithic period. 

The cemetery is unusual in consisting entirely of males. In the note on the skulls published in 
Macramallah Effendi’s report it is stated that there were some females included in the collection. After 
the report had gone to press Macramallah Effendi informed the writer that a part of the cemetery 
was of 18th dynasty date. It turned out that all the female skulls came from this part and that therefore 
the 1st dynasty cemetery contained only remains of males. Dr Batrawi’s examination of the figures
confirms the statement made at the beginning of this note and shows the closeness of the relationship
of the people of the 1st dynasty at Sakkara with those of the 4th and 5th dynasties from Deshasheh
and Medum.


In his report on the discovery of the series of 11th dynasty skeletons at Thebes Mr H. E.
Winlock (1928) says that they were found in ‘a tomb in the row where the grandees of
Mentuhotep’s court had been buried’. He remarks:


Obviously what we had found was a soldiers’ tomb. To judge from the cheapness of their burial they
were only soldiers of the rank and file, and yet they had been given a catacomb presumably prepared
for the dependents of the royal household, next to the tomb of the chancellor Khety. Clearly that was
an especial honour. If we are right in supposing that all had been buried at once, they must have been
slain in a single battle.


Prof. Derry examined the bodies on the spot, and he took measurements of the crania 
and of some of the long bones. About sixty bodies were counted and all proved to be those 










of adult males who had died in the prime of life. Prof. Derry says that the skeletons were 
reburied after they had been examined. 

2. The measurements of the crania. The Sakkara series was sexed and measured by 
Prof. Derry and we are indebted to him for allowing us to use his records in this paper. 
All the absolute measurements, given in Appendix II, are his readings with the exception 
of those of the foramen magnum, which were taken by one of the writers (A.B.) of this 
report. The measurements of the Thebes series were also kindly provided by Prof. Derry, 
together with means he had calculated. The readings for individual crania are given in 
Appendix III.

The technique of measurement followed by Prof. Derry is that of the Monaco Congress 
(Duckworth, 1913). He had used this when measuring the predynastic Egyptian series of 
skulls from Badari, of which part was remeasured later in London by Miss B. N. Stoessiger 
(1927), who followed the biometric technique. The two sets of measurements of the same 
fifty-three specimens have been compared (Morant, 1935), thus showing in detail what 
relations are to be expected between readings obtained by following the two techniques. 
These results were taken into account in preparing the definitions of Prof. Derry’s measure- 
ments given in Appendix I below. The characters are denoted as far as possible there and 
in the tables by the customary index letters of the biometric technique. 

3. The nature of the two series. Mean measurements and standard deviations for the two 
series are given in Table 1. The longest series of Egyptian skulls measured, known as the 
E series, came from a cemetery at Giza used from the 26th-30th dynasties (Davin &
Pearson, 1924). Judging from comparisons of constants for a number of cranial characters, 
most of the other ancient Egyptian series described exhibit almost precisely the same order 
of variation as the one from Giza. In general they have been found to be rather less variable 
than European cranial series, while there is no evidence that there was any appreciable 
change in the variation exhibited by Egyptian populations during the long period from 
early predynastic to Roman times. 

The two new series are shorter than several from Egypt previously described. Counting 
the number of characters for which the standard deviation for one series is greater or less 
than the corresponding constant for the other series, the situation is: 


Sakkara and Thebes: Sakkara S.D. greater for nine and less for ten characters;
Sakkara and Giza: Sakkara S.D. greater for four and less for eleven characters;
Thebes and Giza: Thebes S.D. greater for eight and less for nine characters.


This crude comparison suggests that there can have been no marked differences between 
the variabilities of the three populations represented. As sets of differences are considered, 
the limit of significance accepted may be taken considerably higher than in the case of 
a single difference. Suppose that there is a real distinction if two of the standard deviations
differ by an amount which is 3.5 or more times its probable error. Then one significant
difference is found for the Sakkara and Thebes series (NH, L, Sakkara S.D. greater,
Δ/p.e. Δ = 3.8), none for the Sakkara and Giza, and three for the Thebes and Giza series
(H’ 4.1, S, 3.5, S, 3.5, Giza S.D. the greater in all three cases). The two new series are too
short to give reliable comparisons, but the evidence suggests that the populations they
represent were equally homogeneous, while both were rather less mixed in racial
composition than the 26th-30th dynasty population of Giza.
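The ratio Δ/p.e. Δ quoted above can be reproduced from the constants of Table 1 with the sketch below. It assumes that the probable error of a difference between two independent constants is the square root of the sum of the squares of their probable errors; that rule and the function name are assumptions of this illustration, since the text does not state how the ratio was formed.

```python
import math


def difference_over_probable_error(a: float, pe_a: float, b: float, pe_b: float) -> float:
    """Absolute difference of two constants divided by the probable error of the difference.

    Assumes independent samples, so the probable errors combine in quadrature.
    """
    return abs(a - b) / math.sqrt(pe_a ** 2 + pe_b ** 2)


# Standard deviations of [NH, L] from Table 1: Sakkara 4.00 +/- 0.35, Thebes 2.52 +/- 0.18.
ratio_sd = difference_over_probable_error(4.00, 0.35, 2.52, 0.18)

# Mean lengths L from Table 1: Sakkara 186.9 +/- 0.56, Thebes 181.8 +/- 0.53.
ratio_mean = difference_over_probable_error(186.9, 0.56, 181.8, 0.53)

print(round(ratio_sd, 1), round(ratio_mean, 1))
```

Under this assumption the first ratio agrees with the 3.8 given for NH, L, and the second with the 6.6 quoted for the mean L in the comparison of means in section 4.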




4. Comparisons of mean measurements. Following biometric practice, it may be supposed 
in such a case that no statistical analysis of the series can reveal its racial components.
The relationships of the series have to be judged by comparing them as wholes, on the basis 
of mean measurements, with other series known to exhibit unexceptional variation. It 
was shown by Morant (1925) that the recorded series of ancient Egyptian skulls can be 
divided into two groups. These were called, for convenience, the Upper and Lower Egyptian, 


Table 1. Means and standard deviations (with probable errors) of the Sakkara 1st dynasty
and Thebes 11th dynasty series of male skulls

                     Means                                      Standard deviations
Character*           Sakkara              Thebes                Sakkara        Thebes

L                    186.9 ± 0.56 (41)    181.8 ± 0.53 (54)     5.31 ± 0.40    5.75 ± 0.37
B                    138.7 ± 0.41 (43)    138.3 ± 0.41 (54)     3.99 ± 0.29    4.52 ± 0.29
B’                    96.5 ± 0.39 (36)     93.6 ± 0.43 (55)     3.48 ± 0.28    4.72 ± 0.30
H’                   135.4 ± 0.67 (32)    137.1 ± 0.37 (51)     5.63 ± 0.47    3.91 ± 0.26
[Aur. ht.]           114.8 ± 0.67 (27)    -                     5.20 ± 0.48    -
LB                   102.7 ± 0.57 (29)    100.7 ± 0.34 (46)     4.57 ± 0.40    3.40 ± 0.24
U                    518.8 ± 1.5 (29)     507.4 ± 1.3 (50)      12.0 ± 1.1     13.2 ± 0.89
S1                   -                    125.7 ± 0.47 (52)     -              5.01 ± 0.33
S2                   -                    129.4 ± 0.56 (53)     -              6.00 ± 0.39
S3                   -                    115.2 ± 0.82 (51)     -              8.63 ± 0.58
S                    -                    370.5 ± 1.2 (50)      -              12.7 ± 0.86
[Broca’s Q’]         -                    300.3 ± 0.84 (49)     -              8.72 ± 0.59
fml                   36.7 ± 0.26 (31)    -                     2.18 ± 0.19    -
fmb                   30.4 ± 0.22 (29)    -                     1.74 ± 0.15    -
[G’H]                 71.9 ± 0.55 (30)     72.0 ± 0.37 (45)     4.43 ± 0.39    3.71 ± 0.26
GB                    96.5 ± 0.62 (25)     95.5 ± 0.48 (38)     4.57 ± 0.44    4.40 ± 0.34
J                    127.8 (14)           127.6 ± 0.52 (32)     -              4.33 ± 0.36
[NH, L]               51.2 ± 0.50 (29)     51.8 ± 0.25 (45)     4.00 ± 0.35    2.52 ± 0.18
NB                    25.4 ± 0.21 (30)     25.0 ± 0.20 (42)     1.70 ± 0.15    1.92 ± 0.14
[O1’]                 38.9 ± 0.24 (26)     39.1 ± 0.16 (44)     1.84 ± 0.17    1.55 ± 0.11
[O2]                  32.5 ± 0.20 (26)     33.1 ± 0.23 (44)     1.50 ± 0.14    2.23 ± 0.16
[Prosthion GL]        99.6 ± 0.56 (26)     96.5 ± 0.47 (43)     4.23 ± 0.40    4.61 ± 0.34
100 B/L               74.2 ± 0.26 (39)     76.1 ± 0.26 (54)     2.44 ± 0.19    2.84 ± 0.18
100 H’/L              72.8 ± 0.41 (30)     75.5 ± 0.29 (51)     3.37 ± 0.29    3.06 ± 0.20
100 B/H’             102.6 ± 0.60 (31)    100.8 ± 0.41 (51)     4.95 ± 0.42    4.34 ± 0.29
100 fmb/fml           83.3 ± 0.72 (29)    -                     5.76 ± 0.51    -
[100 G’H/GB]          74.3 ± 0.50 (25)     75.8 ± 0.53 (38)     3.70 ± 0.35    4.83 ± 0.37
[100 NB/NH, L]        49.5 ± 0.58 (29)     48.3 ± 0.48 (42)     4.63 ± 0.41    4.61 ± 0.34
[100 O2/O1’]          83.6 ± 0.51 (26)     84.6 ± 0.49 (44)     3.86 ± 0.36    4.81 ± 0.35











* The characters are defined in Appendix I. A symbol in square brackets denotes either that the measurement 
is one not usually included in the biometric technique, or else that Prof. Derry’s method of taking the measure- 
ments does not accord with biometric practice. 


though there is evidence that the regions represented changed somewhat with time. The
series in the first group came from the neighbourhood of Thebes and sites farther south,
while those in the second group came from the same region of Upper Egypt and sites
farther north. The first group includes all the predynastic series that have been described
and some of dynastic date, the latest being of the 18th dynasty: the second group ranges
from the 1st dynasty to Roman times, though no series available earlier than the 4th
dynasty had come from the region immediately south of the Delta. The Sakkara series
described in the present paper extends the range of such material back to the 1st dynasty.





































It had been found that the means for all these series are almost constant for most of the 
metrical characters commonly recorded, but for a few measurements more significant 
differences are found and these separate the two groups of series. Characters of both kinds 
are treated in Table 2, which is based on Table XIII in Risdon’s paper (1939) on the human 
remains from Lachish (Palestine). The first six characters are those which make the clearest 
distinction between the Upper and Lower Egyptian types of series, and they are all breadths 
or dependent on breadths—the latter being the horizontal circumference and the two 
indices—of the cranium. The Sakkara series is clearly assigned to the Lower Egyptian 
group, and if counted as a member of this the range of the mean minimum frontal breadths 
(B’) for the group is slightly extended. The Thebes series is also assigned to the Lower 
Egyptian group by four of the six characters in question: for U and 100 B/H’, however, 
its means fall within the ranges given for the Upper Egyptian type of series. 


Table 2. Ranges of mean measurements for two groups of series of ancient Egyptian
male crania and means for the Sakkara and Thebes series*

Series                 Period                     B                  J                 B’               U

Upper Egyptian type    Early predyn.-18th dyn.    131.4-134.3 (10)   123.6-127.5 (8)   90.4-92.8 (4)    500.0-510.4 (4)
Sakkara                1st dyn.                   138.7              127.8             96.5             518.8
Thebes                 11th dyn.                  138.3              127.6             93.6             507.4
Lower Egyptian type    1st dyn.-Roman             135.3-139.3 (9)    127.5-131.3 (8)   93.0-96.2 (5)    510.8-518.7 (5)

Series                 Period                     100 B/L            100 B/H’          L                  H’

Upper Egyptian type    Early predyn.-18th dyn.    71.7-73.7 (10)     98.1-101.1 (10)   182.2-185.2 (10)   132.4-135.9 (10)
Sakkara                1st dyn.                   74.2               102.6             186.9              135.4
Thebes                 11th dyn.                  76.1               100.8             181.8              137.1
Lower Egyptian type    1st dyn.-Roman             73.7-76.0 (9)      102.3-106.4 (9)   181.4-185.8 (9)    130.7-136.0 (9)























* The characters are defined in Appendix I. The numbers in brackets give the numbers of series to which the ranges 
relate. In the case of these previously described series the smallest number of crania on which any one of the means is 
based is 16, though this minimum number is about 30 for most of the characters. The numbers on which the Sakkara 
and Thebes means are based can be seen from Table 1, the only one less than 26 being 14 for the bizygomatic breadth 
(J) of the Sakkara series. 


The last two characters in Table 2, which are the length and height of the cranium, fail 
to distinguish the two contrasted groups of series. The means for the two new series fall 
outside the ranges previously given by all the ancient Egyptian material, the Sakkara 
series giving the greatest L and the Thebes the greatest H’. The evidence of other characters 
must be taken into account, but so far the comparisons suggest that the two new series 
are of the Lower Egyptian type, and it is to be expected that they bear a closer resemblance
to some of the series assigned to that group than to any other cranial series. 

At the same time it may be noted that the Sakkara 1st dynasty and Thebes 11th dynasty
populations are clearly differentiated by their mean cranial measurements. There are twenty 
characters in Table 1 for which means for both series are available. The most significant 
difference is for L, and it is 6.6 times its probable error, while five other characters (B’, U,
prosthion GL, 100 B/L and 100 H’/L) also show differences which exceed four times their
probable errors.

5. Comparisons by coefficients of racial likeness. The method of Karl Pearson’s coefficient 
of racial likeness has been applied extensively to series of ancient Egyptian crania. Risdon 
(1939) has given comparisons made in that way for twenty-two male series, including three 
from sites outside Egypt, and the treatment below is almost restricted to comparisons 
between these and the two new series described in the present paper. The procedure fol- 
lowed in applying the method described in several papers in Biometrika was adopted without 
modification.* 

In deriving a classification of a number of cranial, or living, series from the coefficients 
of racial likeness found between them, it has been shown repeatedly that the most sug- 
gestive arrangement is obtained if the closest resemblances of the series, indicated by 
coefficients below a certain value, are alone taken into account. Risdon has given a diagram 
(1939, Fig. 3) showing all the reduced coefficients less than 5.0 between the twenty-two
series with which he dealt. There are fifty-three of this lowest order among the 231 
(= 22 x 21/2) comparisons. The addition of the two new series to the classification referred 
to only requires a knowledge of the reduced coefficients less than 5.0 between them and
the twenty-two series. 

It has been pointed out that inspection of a few mean measurements can indicate whether 
a comparison of two particular series would almost certainly give a reduced coefficient 
greater than the limit chosen, or whether it might provide a value less than 5.0. The
measurements used for this rapid test are six which are known to be those which show the 
most significant differences, and the greatest proportions of such differences, in comparisons 
of the group of series. These are the length, breadth and height of the brain-box and the 
three indices derived from these chords. For the fifty-three comparisons of the twenty-two 




* A ‘crude’ coefficient is defined by

$$\frac{1}{m}\sum\left[\frac{n_s n_s'}{n_s + n_s'}\left(\frac{M_s - M_s'}{\sigma_s}\right)^2\right] - 1 \pm 0.6745\sqrt{\frac{2}{m}},$$

where M_s is a mean based on n_s crania for the first series, M_s’ and n_s’ are the corresponding constants for the
second series and m characters are compared. The σ’s of the long 26th-30th dynasty Egyptian series were
used throughout. The crude coefficient may be written

$$\frac{1}{m}\sum(\alpha) - 1 \pm 0.6745\sqrt{\frac{2}{m}}, \quad\text{where}\quad \alpha = \frac{n_s n_s'}{n_s + n_s'}\left(\frac{M_s - M_s'}{\sigma_s}\right)^2.$$

Its value is largely determined by the sizes of the two samples that happen to be available, if in fact they do not
represent the same population. As many excavated crania are damaged to some extent, in the case of a particular
series means for different characters will usually be based on various numbers of specimens (see Table 1). The
mean number available for the characters used is denoted by n̄_s in the case of the first series and by n̄_s’ in the
case of the second series, and these ‘sizes’ of the samples are usually unequal and may be of very different orders.
To obtain, as far as possible, a measure of the absolute divergence of the types compared which does not depend
on the numbers of crania available, a ‘reduced’ coefficient of racial likeness is computed. This is defined to be

$$\frac{100 \times 100}{100 + 100} \times \frac{\bar{n}_s + \bar{n}_s'}{\bar{n}_s\,\bar{n}_s'}\left[\frac{1}{m}\sum(\alpha) - 1 \pm 0.6745\sqrt{\frac{2}{m}}\right].$$

A reduced coefficient may be supposed an approximation to the value which would be obtained if all the
means for both series were for 100 individuals instead of for the numbers actually available. If a crude
coefficient differs from zero by less than 3.5 times its probable error (a rare occurrence) then it is supposed
that there is no evidence of a significant distinction between the two populations represented. In this case
there is no need to compute a reduced coefficient. Otherwise, reduced coefficients are found and the classification
of a number of series is based on these.
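The definitions in the footnote can be put into a short sketch. It is an illustration only: the function names are hypothetical, and the worked figures are the Sakkara and Lachish entries of Table 3 (crude coefficient 1.04 on 14 characters, mean numbers of crania 31.6 and 249.3). The standard deviations of the Giza E series needed to form the α’s are not reproduced in this paper, so `alpha` is shown without a worked value.

```python
import math


def alpha(mean_1: float, n_1: float, mean_2: float, n_2: float, sigma: float) -> float:
    """Contribution of one character: n n'/(n + n') * ((M - M')/sigma)**2."""
    return (n_1 * n_2 / (n_1 + n_2)) * ((mean_1 - mean_2) / sigma) ** 2


def crude_coefficient(alphas: list[float]) -> tuple[float, float]:
    """Crude coefficient (mean of the alphas minus 1) and its probable error."""
    m = len(alphas)
    return sum(alphas) / m - 1.0, 0.6745 * math.sqrt(2.0 / m)


def reduce_coefficient(crude: float, probable_error: float,
                       nbar_1: float, nbar_2: float) -> tuple[float, float]:
    """Scale a crude coefficient and its probable error to samples of 100 and 100."""
    factor = (100.0 * 100.0 / (100.0 + 100.0)) * (nbar_1 + nbar_2) / (nbar_1 * nbar_2)
    return factor * crude, factor * probable_error


# Probable error for a comparison based on 14 characters.
pe_14 = 0.6745 * math.sqrt(2.0 / 14)

# Sakkara (31.6) with Lachish (249.3), crude coefficient 1.04 +/- 0.25 from Table 3.
reduced, reduced_pe = reduce_coefficient(1.04, 0.25, 31.6, 249.3)
print(round(pe_14, 2), round(reduced, 2), round(reduced_pe, 2))
```

With these inputs the sketch returns values close to the reduced coefficient 1.85 ± 0.45 entered in Table 3 for this pair.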










series giving reduced coefficients less than 5.0, the maximum differences for the six
characters (in mm. or units of the indices) are:

L      B      H’     100 B/L    100 H’/L    100 B/H’
3.1    3.0    3.5    2.0        2.3         2.8

To avoid the danger of missing comparisons which might be of the order required, in
applying the test each of these values was increased arbitrarily by 0.2, giving:

L      B      H’     100 B/L    100 H’/L    100 B/H’
3.3    3.2    3.7    2.2        2.5         3.0

In comparing a new series with the twenty-two it may be supposed that a reduced
coefficient of racial likeness greater than 5.0 would almost certainly be found if the
difference between the means is greater than the accepted limit in the case of any one or more
of the six characters. For such comparisons the coefficients were not calculated. If the
differences between the means are less than the limits for all six characters then a reduced
coefficient less than 5.0 might be found: the coefficients were calculated in all such cases.
In this way detailed comparisons were judged to be required between the new Sakkara,
1st dynasty, series, on the one hand, and six of the twenty-two treated by Risdon on the
other; and between the new Thebes, 11th dynasty, on the one hand, and only two of the
twenty-two series on the other. The previously described series involved in these two sets
of comparisons (one series being included in both sets) are:

(i) Deshasheh and Medum, 4th and 5th dynasties (Thomson & MacIver, 1905). The two
towns are south of Sakkara and both less than 40 miles from it. 

(ii) Gizeh, 26th-30th dynasties (Davin & Pearson, 1924). 

(iii) Sedment, 9th dynasty (Woo, 1930). 

These three and the new Sakkara series are all from Lower Egypt among the total 
twenty-four series referred to above. All the other Egyptian sites mentioned are in Middle 
Egypt and close to Abydos and Thebes. 

(iv) Abydos, 18th dynasty (Thomson & MacIver, 1905).

(v) Abydos, 1st dynasty, royal tombs (Morant, 1925).

(vi) Lachish, Palestine (Risdon, 1939). This series represents an Egyptian population. 
It is assigned to the seventh and eighth centuries B.c., though it is not well dated. 

(vii) Tigré district, Abyssinia, modern (Sergi, 1912, means given in Morant, 1925). 

(viii) Cretans, modern (von Luschan, 1913, means given in Woo, 1936). This series is
not one of the twenty-two dealt with by Risdon. It was included because of its close 
resemblance to the new 11th dynasty series from Thebes. The test based on a comparison 
of the means of the six calvarial measurements shows that the only ancient Egyptian series 
which might give reduced coefficients less than 5.0 with the Cretan series are the Theban
11th dynasty and the Sedment series ((iii) above).
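The rapid test described before the list of series can be sketched as a simple comparison of mean differences with the six accepted limits. The function and dictionary names are illustrative only; the means used are those of the two new series in Table 1, chosen only because they are given in this paper.

```python
# Accepted limits for the six characters (maximum observed differences plus 0.2).
LIMITS = {
    "L": 3.3,
    "B": 3.2,
    "H'": 3.7,
    "100 B/L": 2.2,
    "100 H'/L": 2.5,
    "100 B/H'": 3.0,
}


def characters_exceeding_limits(means_a: dict, means_b: dict, limits: dict = LIMITS) -> list:
    """Return the characters whose mean difference exceeds the accepted limit.

    An empty list means a reduced coefficient below 5.0 is possible and the
    coefficient should be calculated; otherwise it is almost certainly above 5.0.
    """
    return [name for name, limit in limits.items()
            if abs(means_a[name] - means_b[name]) > limit]


# Means from Table 1.
sakkara = {"L": 186.9, "B": 138.7, "H'": 135.4,
           "100 B/L": 74.2, "100 H'/L": 72.8, "100 B/H'": 102.6}
thebes = {"L": 181.8, "B": 138.3, "H'": 137.1,
          "100 B/L": 76.1, "100 H'/L": 75.5, "100 B/H'": 100.8}

flagged = characters_exceeding_limits(sakkara, thebes)
print(flagged)
```

For this pair the differences in L and in 100 H’/L exceed their limits, so the test points to a reduced coefficient well above 5.0.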

It must be emphasized that a reduced coefficient of racial likeness less than 5.0 represents
a very close degree of resemblance. Values of that order have only been found between 
cranial series which would be expected, on account of their provenance, to represent the 
same or closely related populations. There is a danger that low reduced coefficients may be 
misleading owing to the influence on them of extraneous factors, such as inaccuracy in 
sexing or slight and unappreciated differences between the methods of measurement of 
two recorders working independently. It is safe to suppose that the two new series are 
made up entirely of the crania of adult males. In computing coefficients with them care 




was taken to restrict a particular comparison to pairs of means based on measurements 
obtained by following precisely the same technique. 

Owing partly to that restriction, the numbers of characters that could be used in com- 
puting coefficients with the new series are decidedly smaller than the 31 used ideally for 
the purpose. For these comparisons the smallest number of characters used is 9 and the 
largest number is 18.* Risdon (1939, pp. 131-2) has examined the matter experimentally 
and he concluded that use of a smaller number of characters—the set of 14 he considered 
being very similar to the sets we were able to use—can usually be expected to give a fairly 
close approximation to the result which would be obtained from about twice as many 


Table 3. Coefficients of racial likeness between ancient Egyptian, a Palestinian (Lachish)
and modern series of male skulls from Abyssinia and Crete*

Series                                                                                  Crude C.R.L. ± P.E.   Reduced C.R.L.

Sakkara, 1st dyn. (32.1) with Deshasheh and Medum, 4th and 5th dyn. (46.0)               0.19 ± 0.32 (9)      -
                  (32.1) with Abydos, 18th dyn. (49.9)                                  -0.08 ± 0.32 (9)      -
                  (31.6) with Lachish (249.3)                                            1.04 ± 0.25 (14)     1.85 ± 0.45
                  (31.6) with Gizeh, 26th-30th dyn. (885.7)                              2.45 ± 0.25 (14)     4.02 ± 0.41
                  (31.6) with modern Abyssinian (61.4)                                   2.43 ± 0.25 (14)     5.82 ± 0.60
                  (31.6) with Abydos, 1st dyn. royal tombs (33.6)                        1.91 ± 0.25 (14)     5.87 ± 0.77
                  (30.3) with Thebes, 11th dyn. (46.7)                                   4.26 ± 0.22 (18)     11.45 ± 0.59

Thebes, 11th dyn. (49.2) with Sedment, 9th dyn. (37.9)                                  -0.43 ± 0.28 (12)     -
                  (49.0) with modern Cretans (50.4)                                      2.01 ± 0.30 (10)     4.04 ± 0.60
                  (48.3) with Deshasheh and Medum, 4th and 5th dyn. (46.0)               2.23 ± 0.32 (9)      4.73 ± 0.68

Sedment, 9th dyn. (37.5) with Deshasheh and Medum, 4th and 5th dyn. (39.9)               1.88 ± 0.25 (14)     4.86 ± 0.65
                  (37.7) with modern Cretans (47.9)                                      2.71 ± 0.25 (15)     6.42 ± 0.59

* The numbers in brackets following the names of the series are the mean numbers of crania for the characters
used in computing the coefficients. The numbers in brackets following the crude coefficients are the numbers
of characters on which they are based. Woo (1930) gives coefficients with two of the series in the table above,
and the values there differ from his because they were recalculated omitting the term 1/m, which was discarded
after 1930. The standard deviations of the long E series of 26th-30th dynasty crania from Gizeh (Davin &
Pearson, 1924) were used in computing all the coefficients in the table.


characters. Occasionally, however, use of a smaller number of characters may suggest 
a rather misleading conclusion, and it will tend to indicate a rather wider separation of the 
series than that which would be found if all 31 characters could be used. With these reserva- 
tions in mind the coefficients with the new series may be accepted as the best approxima- 
tions it is possible to obtain in the circumstances. 
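
The relation between the crude and reduced coefficients in Table 3 can be checked numerically. The sketch below assumes the usual definition of the reduced coefficient, namely the crude coefficient multiplied by 50(n₁ + n₂)/(n₁n₂), where n₁ and n₂ are the mean numbers of crania given in brackets after the series names. That definition is not stated in this section, so it is an assumption here; the function name `reduced_crl` is likewise only illustrative.

```python
def reduced_crl(crude, n1, n2):
    """Reduced coefficient under the assumed definition:
    crude * 50 * (n1 + n2) / (n1 * n2)."""
    return crude * 50.0 * (n1 + n2) / (n1 * n2)


# (crude C.R.L., mean n of first series, mean n of second series,
#  reduced C.R.L. as printed in Table 3)
rows = [
    (1.04, 31.6, 249.3, 1.85),  # Sakkara with Lachish
    (2.43, 31.6, 61.4, 5.82),   # Sakkara with modern Abyssinian
    (2.01, 49.0, 50.4, 4.04),   # Thebes with modern Cretans
    (2.71, 37.7, 47.9, 6.42),   # Sedment with modern Cretans
]

for crude, n1, n2, tabulated in rows:
    computed = reduced_crl(crude, n1, n2)
    print(f"computed {computed:.2f}, tabulated {tabulated:.2f}")
```

Under this assumption the computed values agree with the printed ones to within rounding of the crude coefficients.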

All the coefficients of racial likeness found with the Sakkara 1st dynasty and Thebes
11th dynasty series are given in Table 3. Fig. 1 is a reproduction of part of a diagram given
by Risdon (1939, p. 137) for the twenty-two ancient Egyptian and related series with which
he dealt, with the addition of the two series described in the present paper and that of
modern Cretans. The Sakkara series is seen to be an unexceptional member of the ‘Lower
Egyptian’ constellation, having two insignificant coefficients and other close connexions
* The characters common to all the comparisons with the new series are L, B, H’, LB, J, NB, 100 B/L,
100 H’/L and 100 B/H’. Others used in some cases are B’, U, S, fml, fmb and 100 fmb/fml, and for the coefficient
between the two new series only G’H, NH, L, O₁’, O₂, 100 G’H/GB, 100 NB/NH, L, 100 O₂/O₁’.














A. BATRAWI AND G. M. Morant 25 


with members of that group. On the other hand, the 11th dynasty soldiers from Thebes
clearly represent an Egyptian population of an aberrant type. The direct comparison fails
to distinguish this from that of the 9th dynasty series from Sedment. Woo (1930) had
found that the latter stands apart from all the other Egyptian series, and he pointed out
that the Sedment series bears a closer resemblance to a series of modern Cretans (von Luschan,
1913) than to most of the ancient Egyptian series. The Thebes 11th dynasty series has
a reduced coefficient less than 5·0 with the Cretan, though the latter has no other coefficient
of this order with any of the other Egyptian series.

6. The racial history of ancient Egyptian populations. The new evidence makes rather 
more precise the racial classification of ancient Egyptian populations given in earlier 
craniological papers in Biometrika. The 1st dynasty series of crania from Sakkara is the 
earliest in date that has been described representing the region immediately south of the
Delta. It is an unexceptional representative of that group which must have prevailed 





[Fig. 1: diagram of connexions between the series. Legend: connexions insignificant; significant and < 3·5; 3·5-5·0. Labels legible in the scan include Lachish, Cretans (modern), Abydos royal tombs (1st dyn.), Sakkara (1st dyn.), Sedment (9th dyn.), Gizeh (26th-30th dyn.) and a group of Egyptian series (early predynastic to 18th dyn.).]

Fig. 1. Reduced coefficients of racial likeness between the two new and other ancient
Egyptian and related series of male crania.


in the region—with only slight local and secular variants—from the 1st to the 30th
dynasty and probably in both earlier and later periods as well. Such populations are said 
to be of ‘Lower Egyptian’ type. 

In the region to the south, round Thebes and Abydos, the population was of a second 
racial type from the earliest predynastic (Badari) epoch for which there is any adequate 
craniological evidence. This is called the ‘Upper Egyptian’, though it would be better 
to call it Southern Egyptian. The population became modified slowly down to some 
time about the 18th dynasty. The change was such that the ‘Upper Egyptian’ type 
of population came to bear a closer and closer resemblance to the ‘Lower Egyptian’, 
though the two groups remained clearly distinct. About the 18th dynasty there must have 
been a fairly rapid, if not abrupt, change in the racial composition of the population of the 
Thebes and Abydos region. Nearly all the series from there, of that and later dates, are not 
of ‘Upper’ but of ‘Lower Egyptian’ type. They diverge slightly from the populations of 
the region immediately south of the Delta, however, in the direction of the ‘Upper 
Egyptian’ type. Six of these series—viz. those from Abydos, Thebes and Denderah of 













dates ranging from the 18th dynasty to Roman times—are shown in Fig. 1. The 1st dynasty
series from royal tombs at Abydos, also shown there, is an exception on account of its date.
The obvious explanation of its peculiar position is that it represents an intrusive and more
or less isolated community which was derived from the other centre of population to the
north.

This accounts for twenty-two of the twenty-four series of crania considered. The classi- 
fication of these does not seem to necessitate reference to any non-Egyptian peoples. 
This is not so, however, in the case of the remaining two series, viz. the new one of 11th 
dynasty soldiers from Thebes and the 9th dynasty series from Sedment (Deltaic region). 
These two might represent the same population as far as can be seen from the direct 
comparison, and both stand apart from the ‘Lower Egyptian’ constellation of series (see 
Fig. 1). The fact that the 11th dynasty series from Thebes has a close resemblance to one
of Cretans, which is of modern date, suggests that the two aberrant communities in question
may have been derived from the crossing of ancient Egyptians with people from some
European or Asiatic source.

The mean basio-bregmatic heights (H’), cephalic indices and height-length indices are 
higher for the 11th dynasty Thebes and Sedment series than for any other of the series 
considered. The types, defined by average measurements, of these two thus diverge from 
that prevailing in ancient Egypt in the direction of the ‘Armenoid’ type. Elliot Smith 
(1911 and elsewhere) supposed that intrusive ‘Armenoid’ aliens played a considerable 
part in modifying the population of the country and that ‘long before the time of the 
New Empire, Egypt was permeated from one end to the other with this foreign element’. 

Our interpretation of the evidence fails entirely to support this hypothesis. There is no 
need to suppose that any people foreign to the country played a substantial part in 
modifying its population from predynastic to Roman times. The communities represented 
by the 11th dynasty Thebes and Sedment series may possibly have been derived from the 
crossing of Egyptian and ‘Armenoid’ people, but they stand apart. The remarkable point 
is not that two out of twenty-four populations should be peculiar in that way, but that the 
remaining twenty-two show interrelationships which do not suggest any admixture with 
alien stock. They can readily be explained on the supposition that there was a steady 
transference of population from the Deltaic region to the region of Thebes and Abydos, 
where the population was originally of a somewhat different type, from early predynastic 
times to the 18th dynasty. About that time the movement must have been accelerated, 
and thereafter the populations of the two centres were almost indistinguishable in racial 
type. The racial history of ancient Egypt was of a simple kind. 

7. Summary and conclusions. This paper deals with forty-four male crania of 1st dynasty
date from Sakkara and with fifty-five crania of 11th dynasty soldiers from Thebes. Indi-
vidual measurements taken by Prof. D. E. Derry are given in appended tables. Judging
from the rather small samples, the two populations represented exhibited the same order
of variation, while both were rather less mixed in racial composition than the population
of Giza from the 26th-30th dynasties. Mean measurements clearly differentiate the two
new series from one another. Judging from characters considered singly, both series bear 
a close resemblance to some other ancient Egyptian series, and both are of ‘Lower’ rather 
than ‘Upper Egyptian’ type. Comparisons are made by the method of the coefficient of 
racial likeness, though decidedly fewer characters than the standard set of thirty-one used 
when possible are available for the purpose. The resulting relationships are shown in Fig. 1. 






















APPENDIX II. INDIVIDUAL MEASUREMENTS OF THE FIRST DYNASTY MALE CRANIA FROM SAKKARA*




































































| ia Le 
| ; | } b 
eee fee y lr H , 
poate L B {[Aur.ht.]| DB | U | (o"ai] | GB i J ie 4 NB | 10,74 
; } ! ' aa 2 oe 
hee Toc. ne 
. oo I 68-5 93 52°5 43 | 
; a oe 113 poe ee # 2 103 126 | 52°5 | 27. | 37°5 
Z | 84 | 1375 SAS ee | — | S03 | 225 | 305 
19 y 
1gI | 147 125 98 540 66 «6| «688 | | 45 23°5 | 3 
a 186°5 141°5 115 100 517°5 | 70°5 | 99 132 52 | 23°5 | 45 
16 180°5 | 139 116°5 tor , 503 | 735 97 ye | 54°5 a | 35°5 
17 189 | 139°5 = 106°5 ne = “i ae pa i 
18 179 134 —_ —_ - - = ~ = pean | my 
21 — 147 _— — - - + = Ts “= 
23 fer pt — _ * sad % “ = pa | sit 
27 —_ | 135°5 Paes - - 6 Poe. | 44°5 | 24 40 
| tees | 136 we | 3 | 35 [8 | wos -| | abs lf 
32 189°5 | 130°5 st 107°5 pel 77°5 a aa 56°5 | 26°5 39°5 
~ ie we 113 04 516 72°5 | yor 127 | 485 | 26 | 385 
7 moked | Xi 114°5 102 5t0°5 | 69 | (925 | 124 | 48 | 2 | 35 
_ “ “= 6 93°5 —_ 40 25 3 
ees 19° | 107 520:5 70 | 100-5 — | 485 | 245 | 40°5 
58 189 | 137 ad | a4 ae ad ee cet, a - sit 
i ‘o | rn 107 | 100°5 | 518 67? | go 126 | 46 | 2 40°5 
= a > _ s75 | 255] — 
( 18 138°5 | 785 | 97 | 57° 
9 178 \ mo 109 | 100 \ = 75°5 | pest | 50°5 | 23: 38 
7 9 138°5 — S a jade, pees ee 
bo a | 138°5 121°5 Int | 535 | 77°5 | ca | ad | 57 | Lae | yee 
87 181 | 139°5 — Pt RCE Beak: Pou Bp: 2a ge 
= Yen =e Me | = bon Pe) Se 
90 } 19375 | 138 : . pay vee = 
gl 192°5 147°5 _ IOI'5 | 535 | - | Eine sini | we | ne ae 
92 | 1785 | 132°5 Es =a | 5°7 ii BR: rs er ee 
98 183 134°5 a | me 2 oi 8 | 25°5 | 4! 
2 2 96 135 45°5 St - 
” cn — ote | m5 | 8 i 5 | 104 | 130 = |) =5§0°5 25 | 38 
pee oe ta 113 | to2-5 | 511°5 | 67 | 915 126°5 , 48:5 | 265 | 38:5 
ian hg syn 119 | 101-5 | §01? | 70-5 93°5 | 22-5; 50 | 26 | 385 
124 } ' ~ iS te " 2: ve 
128 185°5 | 139 : . 113 | 97°5 | ee | ZS. | ws | ms, = 28 30 
141 184? | 134 116 IOI-5 | 515 5 | 9 | 19 ; 
: | — | 37°5 
I 195 | 139 109 109°5 | 537 74 . = 2 
= 7. 0 |. 837 — ; — | — 7O'5 = 2. ae ae | > a 4 
: | “5 | 6-5 | 132 1 SEs. yee 
Sin ie 20 | tors | Seo | rs | oe | ties | goa | 48 
le ac Be 116 —|553 | 7 | ors| — | 33 | 2s | as 
184 194 | 140°5 9) | | 52 : e- a 33 
| 190 185 | 130 1125 | 1125 a 7 al 4 fees | pst 
Sasi 188-5 | 142°5 | | 5 @ | | iz 








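* The measurements are defined in Appendix I. A symbol above in square brackets denotes either that the measurement is one not usually included in the biometric technique, or else that Prof. Derry’s method of taking the measurement does not accord with biometric practice.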





APPENDIX II (continued)






































t | | | | | | 
| | | rosthion | B . B | 
NH, 43) NB | {0} (0) Or) | Smt | fm | 1007, | 1007 | 10 | % az || Le wir! | 
| 
aid ee Bead oot oa id hd Wad 
aes rose | | 
a ped een saa! fe = re Os Ae re a oa 
SPS | 2e5 | + S55 | 962 | 40 | 31°5 | wre | ee | 109°7 | 73°7 46°7 
52°55 | 27 | 37°5 | 30°5 | 96 | 38°5 | 305 | 747 | 71:5 | 104-6 69°9 50°4 
50°5 | 22°5 | 30°5 | 31°5 | 5a = os a eS Gee ae | 75°0% 44°0 
45 | 235 | 38 305 | 101 | 34 | 305 | 770 | 754 | toa 75°0 5272 
52 23°5 | 4! | a 96 |} 35 | 29 75°99 | 72:7 | 104-4 | 7i-2 45°2 
545 | 24 | 355 | 33°5 101 | 36 | 315 | 770 | 776 | 903 75°8 53°2 
2 oe — | 5 | 385 | 315 | 738 | 757 |) 97% = = 
- nang ae ce! Weer: ot — | We ae at ” ot 
— 275 | — - | — | 20 | _ | 109-3 ~ 
— — | — — H —_— | —_ | — = = — | —_— —_— 
— ee en | a a | 365 | 34 | — | — | soos | — | -— 
445 | 24 | 4° Py oe TS ee ee | | 70°4 53°9 
50 | 285 | go | 34 S| 95°5 40 31 721 | 68-9 104°6 75°6 | 50°4 
565 | 265 | 305 | 325 Tut | 34 «| 385 | 723 | Gog | 1045 | - | 469 
485 | 26 | 385 | 32 98 39 |: 31 72:9 737 | 989 | 71°8 53°7 
48 | 25) «| 35) 38s 96? 375 | 31:5 | 765 | 792 965 | 746 5271 
40 | 25 | 38 «| 38 90°5 31 29 _ | 68-6 | — | 68-4 53° 
485 | 245 | 40°5 | 32 | = 105 | 39 | 32 25 | 767 | 94°5 09°7 50°5 
5 27 ah, gar go? are se | 72°5 | 104°4 = 49°1 
46 | 25 | 40°5 | 32 95? 35 31 7775 \ 728 | 106-4 74°74! 53°2 
. £e o  Be es Bee - 2 + cee aa Bagi pth 3 
50°5 23 | 38 33 1OI°5 | 36 29 | 781 71-6 1090 75'1 40°7 
Jee Gar Fo Dia Be +3 ee = pa gee i a 7 rs 
| 57 | 225 | 40 | 34 105 | 36 | 265 | 71-0 | 72°3 | 98-2 731 39°5 
| Sedo | + Arle bm ie rss = = 
| — -}— | — |} — | 25 | 739 96:8 = — 
nee Wich? Becmtl Bains | 39 | 28 | 713 67-7 | 10573 - ~ 
— |—|—]|—-] i | 39 | 35 | 766 rE 110-9 7 = 
Pog tet cubis het a a © oe - 
a: a ee wee — | 735 | = es - as 
485 | 255 | 41 | 32°5 | 100 32°5 | 20°5 | 74°4 773 | 962 75°0 2-6 
50-5 | 25 | 38 | 31°5 105 | Ses > 7 eee 100-7 69°7 49°5 
48°5 26°5 | 38-5 | 31°5 99 37°5 | 30°5 799 | 73°0 | 109°5 73°2 54°6 
50 | 2 | 35 i 33 | 100 35 28-5 | 73° | 77°5 | 95:0 75°4 52-0 
545 | 23 — | — 975 | 37 | 205 | 749 | 71-7 | 104°5 7571 42-2 
49 | 3% | 36 | 32 | 85 395 | — | 72:87 | 7662 | 950 a3 57°} 
Ya foe Cel es 1 eee 38 | 295 | 71-3 | 67-7 | 105-3 | at | aa 
'S. -- Be Fel Behe eee — l=) — | Ces | — + — | — 57°5 
| §2°5 2 1S et ee Sct ee ee 73°73 | 68-3 107°4 80:8 | 51°4 
50 26 «| 38) «| 3375 | 100 | 33°5 | 28 77°2 | #7772 | roO3Ko 82-4 46°4 
53 25 43° | 37S | a _ en oe ae - = 81-0 | 47°72 
51 24 38 32°5 | 109 37 29 i 70-3 72°7 96-7 7 | 47°! 
| eft be SE Wigaha es enh, | 


































[100 NB/NH, L]    [100 O₂/O₁’]    100 fmb/fml    Remarks
46°7 86-6 78-7 About 18 years. Metopic 
50°4 81-3 79°2 Skull female? but pelvis definitely male 
44°0 79°7 ae _ 
52°2 80-3 89-9 Root of nose depressed. Slight hydrocephaly ? 
45°2 78-0 82-9 _— 
53°2 94°4 87°5 About 16 years 
pet abe | 87-8 ae 
ae oid 78-4 es 
. ome fr - 
53°9 80-0 88-9 — 
50°4 85:0 173 ie 
| 46°9 | wee 92-6 -— 
53°7 83°1 79°5 me 
52°75 |  go-0 84:1 About 19 years. Skull female? but pelvis definitely male 
| 53° | 81:6 93°5 | Distorted by grave pressure 
50°5 79°0 82-1 —_ 
491 _ — =~ 
53°2 79°0 88-6 About 18 years 
44°3 = —_ 
40°7 86-8 80-6 = 
“— a — Metopic 
39°5 85-0 73°6 on 
_— > — About 17 years 
as i 71-8 = 
_ ~~ 89°7 — 
— — _— About 18 years 
52°6 \° pees go'8 2 
| 49°5 | 82-9 87°5 = 
54°60 81-8 81-4 About 18 years 
52-0 | 85:8 81-5 | About 20 years 
42-2 | - 79°7 About 20 years 
57°1 | 88-9 — Metopic. Distorted 
| is 82-7 77°5 — 
57°5 Se = = 
| 51-4 | 80-5 82-4 Old 
| 46°4 | 88-2 838 _ 
47°2 | 87-2 — ig 
| 47°1 85°5 78-4 About 17 years. Negroid 
| — ! _— — — 



















APPENDIX III. INDIVIDUAL MEASUREMENTS OF THE CRANIA OF ELEVENTH DYNASTY SOLDIERS FROM THEBES*





























&, &, U (@’H] GB J [V4, 1 
125°5| 129 112 502 72°5 102°5 128 51°5 
135 | 98 | 495 | 685 88-5 | 1205 | 49 
121 Im5 4 78 _ ~~ 57 
129 tog | 497 69 100°5 134 5° 
123 127 510 72°5 92°5 122 53 
137 | 115 yee 75 102 133 5° 
121 119 506 74 96°5 130 53°5 
130 | 140 | 535 | 70 985 | 130 5°°5 
130 116 _ _ —_ _- oth 
134 | 105 | 508 | 77 94°57] 1275 53 
144 106 | 498 | 795 96 132°5 55 
127 113 487 690°5 93 128-5 49 
125 1I5 508 75°5 — ony sas 
119 117 503 69 96 121 52 
130 108 | 495 69 go 125 3st 
129 118-5) 488 | 64:5 96°5 126 48-5 
119 119 501 71 99 133 52 
120 122 524 70°5 93 — 50 
11g 121 495 71°5 81 119 50? 
136 124 520 68-5 97 - 51°5 
137 132 541 75 99 131 5° 
730 | 108 | 512-5] 69:5 ae = 5° 
129 103 512 78 97 129°5 57°5 
138 106 500 74 95 124 53 
123 128 520 7oO 94 127°5 53 
133 | 116 | 530 | 74 94 54 
























































49 
133 | 126 | 523 | 77-5 97°5 | 138 57 
135 120 | 527 74°5 93 129 49 
135 | m1 —~ 76 98 128 53 
128 116 502 68 _— aoe 49 
140 118 525 68 105 — 52 
139 129 | 533 75 99 130 54 
136 124 520 | 695 94 129 48 
135 | 104 | 487 | 69:5 88 124 48°5 
130 112 507 65°5 91 124°5 50° 
134 | 103 | 506 | 72 97 132 52 
122 105 487 74°5 93 122°5 54°5 
124 120 501 71 — — 54 
135 | 108 | 491 | 72 92 122°5 49 
122 101 500 78 99°5 128 57 
122 116 511 67 _— — 52 
125 | 103 | 484 | 71-5 99 127°5 5°°5 
128 123 521 —_— =— ees 
128 108 508 _ me a= 
_— — — 73°5 94 SI's 
125°5| 128 508 67? — 50 













APPENDIX III (continued)




































































; B H’ B a’ 
s | pa, za} we | (0,1 | toa | Prehien | 100 F|1007-| x00 57, | [100 FF 
(28 51°5 24 38°5 31 101 5 765 100-0 7O°7 
[20°5 49 23 38 32°5 88-5 81:6 79°6 102-6 77°4 
— 57 225 | 39 37 95 730 | 7471 98°5 — 
[34 5° 30 42 32 108-5 758 | 772 98-2 68-7 
[22 53 23 38 33 93°5 747 | 766 97°5 78-4 
[33 5° 25°5 | 385 | 35 102 714 | 74°6 95°7 73°5 
130 53°5 25 41 37? 95°5 753 | 74:2 | 102-2 7&7 
130 50°5 25 42°5 | 34 99°5 JOO | 72°5 96°5 711 
_ _ — — — _ 74°72) 71°7 104°1? _ 
127°5 53 24°5 | 37 33°5 89 780 | 79°! 98-6 81-5? 
132°5 55 25 415 | 35 100°5 709 | 79°! 88-9 82:8 
128-5 49 26 41 30 93 760 | 80-6 94°3 74°7 
— 55°5 26 39 37 98 76°3 | 72:2 | 105:7 o 
121 52 : 39 29°5 95°5 75°5 | 70°38 | 1067 71-9 
125 51 24 38 29 g1 730 | 7471 98°5 76°7 
126 485 | 24 38 32 94°5 735 | 752 | 978 8 
133 52 25 39 32 91 757 | 75% | 102-2 77 
—_ 50 30 38 33°5 96 7773 | 697 | 1108 75°8 
119 50? 20 39 33°5 92°5 80:6 | 81-7 98-6 88-3 
a 515 27 39 32°5 92 774 | 74:5 | 1040 70°6 
131 50 255 | 40°5 | 32 104 71-9 | 69°9 | 102-9 75°7 
ones 50 24 41 34°5? 97 779 | 75°9 | 102-5 8 
129°5 57°5 24°5 | 41 36°5 91°5 79°5 | 80-9 98-3 80-4 
124 53 24°5 | 37°5 | 34°5 95°5 730 | 78-4 93°2 779 
127°5 53 23 39 35 89°5 755 | 74% | roi 74°5 
aes 54 24°5 | 40 33 98°5 "6 | 75°5 | 101-4 78°7 
127°5 53 24°5 | 37°5 | 32°5 —_ 80°5 —_ _ 82:5 
oe 49 — “ne ie 103°5 778 | 77°5 | 1004 69°5? 
138 57 27 4° 37°5 103°5 728 | 77:2 94:2 79°5 
129 49 26 38°5 33°5 100 82-5 76°7 107°6 80-1 
128 53 24 49°5 | 35 96 776 | 73°9 | 104°5 77°5 
= 49 25 37°31 30 95°5 77% | 73°5 | 104°9 _ 
- 52 27 39 32°5 92°5 75°83 | 73°4 | 103-2 64°8 
130 54 22 40 32°5 97°5 745 | 69°3 | 107°6 75°7 
129 48 24°5 | 385 | 32°5 104 7Ut | 73°4'| 968 73°9 
124 48°5 24 36°5 | 29 100°5 760 | 78:9 96°4 79°0 
124°5 5° 30 38 3! 97°5 728 | 72:5 | 100-4 72-0 
132 52 25°5 41 36 96 81-1 77° 105°2 34°2 
122°5 54°5 23°5 | 38 33°5 96 775 | 780 99°3 80-1 
_ 54 ae 385 | 32 94°5 74°7 | 7270 | 103-8 —_ 
122°5 49 24 37 29 89°5 76:8 | 80°5 95°4 783 
128 57 25°5 | 42 33°5 96 75°83 | 74:7 | 1015 78-4 
— 52 26 42 33 102 761 71:8 105°9 _ 
127°5 50°5 24°5 | 37°5 | 315 98°5 768 | 79°1 97°! 722 
= a an _ — —_ 785 | 719 | Togr oe 
_ = <= = — = 727 | 732 99°3 — 
~ 51°5 26 37°5 | 32 ee so “e = 78-2 
= 50 — 38 35 92°5 767 |. 78:3 97°9 — 
— — — — oo — 78-2 75°5 103°6 —_— 
= = pas “si — = 75°6 | 76:22) 99:3? = 
sea a oe #06 ae iz 75°7 = oe S 
— — — — — — 80-2 78°5 102°2 — 
_ _ _ _ _ _ 78-0 78-0 100°0 — 
— oe — _ _ = 75°9 | 73°8 | 103-0 | sie 













APPENDIX III (continued)











Prosthion GL | 100 B/L | 100 H’/L | 100 B/H’ | [100 G’H/GB] | [100 NB/NH, L] | [100 O₂/O₁’] | Remarks
101 5 | 765 100-0 70°7 46-6 .80°5 = 
88-5 1:6 | 796 | 102-6 77°4 46°9 85°5 = 
95 73° | 74:1 985 = 39°5 94°9 : - 
108-5 75°8 72 98-2 68-7 60-0 76-2 Metopic 
93°5 747 | 766 | 97°5 784 43°4 86:8 Metopic 
102 714 | 74°6 95°7 73°5 51-0 909 on 
95°5 75°8 | 742 | 102-2 767 40°7 go-2? ~— 
99°5 72°5 96°5 71 49°5 80-0 ae 
_ 74°7?| 707 104°1? — _ _ _ 
89 78-0 | 79°1 98-6 81-5? 46-2 90°5 - 
100°5 709 | 791 88-9 82-8 45°5 84°3 Left parietal fractured 
93 760 | 80-6 94°3 747 53°! 73°2 — 
98 763 | 72:2 | 105-7 we 46°8 94°9 on 
95°5 75°5 | 70°38 | 106-7 719 48-1 75°6 = 
91 730 | 741 98°5 76°7 471 ‘763 — 
04°5 73°5 || 75°2 97°8 66:8 49°5 84-2 - 
9! 76-7 751 102°2 717 481 82:1 _ 
96 7773 | 69°7 110°8 75°8 60-0 88-2 Metopic 
92°5 80:6 | 81-7 98-6 88-3 40-0? 85-9 _ 
92 774 | 745 | 104-0 70°6 52°4 83°3 - 
104 719 | 69°9 | 102-9 75°7 510 79°0 ~~ 
97 779 | 75°9 102°5 _ 48-0 84-2? Fractures in parietal, frontal and orbital regions 
91°5 79°5 | 80-9 98-3 80-4 42°6 89-0 ai 
95°5 73°0 | 78-4 93°2 779 46-2 92-0 ; = 
89°5 755 | 742 | rors 74°5 43°4 89°7 Metopic 
98°5 766 | 75°5 | 101-4 78°7 45°4 82:5 Metopic 
80-5 _— _ 82°5 46°2 86-7 Metopic 
103°5 778 | 77°5 | 100-4 69°5? — ni Metopic 
103°5 728 | 772 94°2 79°5 47°4 93°8 i - 
100 82-5 76°7 107°6 80-1 53°1 87-0 Metopic 
96 776 | 73°9 | 104°5 775 45°3 86-4 Metopic 
_ 78-92 oo a _ _ Metopic 
95°5 77% | 73°5 | 1049 - 51°0 80-0 - 
92°5 758 | 73°4 | 103-2 64:8 519 833 av 
97°5 745 | 69°3 | 107-6 75°7 4°°7 81-3 = 
104 | 7IK | 73°4 96°8 73°9 51-0 84-4 — 
100°5 760 | 78-9 96°4 79°0 49°5 79°5 — 
97°5 72:8 72°5 100°4 72:0 60-0 81-6 _— 
96 81-1 77°1 105°2 44°2 49°0 87-8 _ 
96 775 | 78-0 99°3 80-1 43°! 88-5 = 
94°5 747 | 720 | 1038 — = 83-1 = 
89°5 768 | 80-5 95°4 78°3 49°0 78-4 el 
96 758 | 74:7 | 101s 78-4 44°7 79°8 — 
102 761 71:8 105°9 — 50-0 78-6 _ 
98°5 76°8 79°1 97°1 72-2 48°5 84°3 Metopic 
—_ 78°5 71-9 1091 _ — =e Metopic 
~ 727 | 73°2 99°3 — — = Metopic 
_ _ _ — 78-2 50°5 85-4. Metopic 
92°5 767 |. 78:3 97°9 _ _ g2°1 =—_ 
— 782 | 75°5 | 103-6 — = — — 
_ 75°6 76-2? 99°3? _ _ — _ 
— 757 —_—— — — — —— —_ 
_ 80-2 78°5 102-2 _— _ _ —_ 
_ 78-0 78-0 100-0 — — _— ao 
> 759 | 73°8 | 103-0 - ag — — 





























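* The measurements are defined in Appendix I. A symbol above in square brackets denotes either that the measurement is one not usually included in the biometric technique, or else that Prof. Derry’s method of taking the measurement does not accord with biometric practice.
… associated with particular skeletons and hence the numbers used exceed fifty-nine.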













The Sakkara 1st dynasty series, which is the earliest from the region immediately south
of the Delta, is an unexceptional member of the ‘Lower Egyptian’ constellation, and it can
be supposed to typify the population of Northern Egypt at the time. The 11th dynasty
series of soldiers from Thebes is linked to the same group, but it diverges from it. The
type is indistinguishable from that of a 9th dynasty series, from Sedment. The former also
has a link with the type of a series of modern Cretans. The two aberrant communities of
Thebes and Sedment must be supposed to have been derived from the crossing of ancient
Egyptians with people from some European or Asiatic source. Our knowledge of the racial
history of ancient Egypt derived from craniological evidence is reviewed.


REFERENCES 


Davin, A. G. & Pearson, K. (1924). On the biometric constants of the human skull. Biometrika, 16, 328-63.

Duckworth, W. L. H. (1913). International Agreements for the Unification (a) of Craniometric and Cephalometric
Measurements, (b) of Anthropometric Measurements to be made on the Living Subject. Camb. Univ. Press.

Luschan, F. von (1913). Beiträge zur Anthropologie von Kreta. Z. Ethn. 14, 307-93.

Macramallah, R. (1940). Un Cimetière Archaïque de la Classe Moyenne à Saqqarah. Imp. Nat. le Caire.

Morant, G. M. (1925). A study of Egyptian craniology from prehistoric to Roman times. Biometrika, 17, 1-52.

Morant, G. M. (1935). A study of predynastic Egyptian skulls from Badari based on measurements taken by
Miss B. N. Stoessiger and Prof. D. E. Derry. Biometrika, 27, 293-309.

Risdon, D. L. (1939). A study of the cranial and other human remains from Palestine excavated at Tell Duweir
(Lachish) by the Wellcome-Marston archaeological research expedition. Biometrika, 31, 99-166.

Sergi, S. (1912). Crania Habessinica. Contributo all’ Antropologia dell’ Africa Orientale. Rome: Loescher.

Smith, G. Elliot (1911). The Ancient Egyptians and their Influence upon the Civilization of Europe. Harper.

Stoessiger, B. N. (1927). A study of the Badarian crania recently excavated by the British School of
Archaeology in Egypt. Biometrika, 19, 110-50.

Thomson, A. & Randall-MacIver, D. (1905). The Ancient Races of the Thebaid. Oxford Univ. Press.

Winlock, H. E. (1928). The Egyptian Expedition, 1925-27. Bull. Met. Mus. Art, New York.

Woo, T. L. (1930). A study of seventy-one ninth dynasty Egyptian skulls from Sedment. Biometrika, 22, 65-93.








APPENDIX I. DEFINITIONS OF MEASUREMENTS


Individual measurements of the two series of crania are given in Appendices II-III. The
contractions used there and in tables in the text to denote characters are:

L = maximum glabella-occipital length. B = maximum horizontal breadth. B’ = minimum
frontal breadth. H’ = basio-bregmatic height. Aur. ht. = ‘vertical height from line
joining highest points of external auditory meatuses’. LB = basion to nasion. U = maximum
horizontal circumference above the superciliary ridges. S₁ = arc nasion to bregma.
S₂ = arc bregma to lambda. S₃ = arc lambda to opisthion. S = total sagittal arc from
nasion to opisthion. Broca’s Q’ = transverse arc from ‘the most prominent point on the
posterior root of the left zygoma, exactly above the auditory aperture’, to the same point
on the right passing through the bregma. fml = basion to opisthion. fmb = maximum
breadth of foramen magnum. G’H = nasion to alveolar point. GB = facial breadth between
lowest points on zygomatico-maxillary sutures. J = maximum breadth between zygomatic
arches. NH, L = nasal height from nasion to point furthest removed from it on the
margin of the left pyriform aperture. NB = maximum breadth of the pyriform aperture.
O₁’ = breadth of right orbit from the dacryon. O₂ = maximum height of right orbit.
Prosthion GL = basion to prosthion.





THE GENERALIZATION OF ‘STUDENT'S’ PROBLEM 
WHEN SEVERAL DIFFERENT POPULATION 
VARIANCES ARE INVOLVED 


By B. L. WELCH, B.A., Ph.D.


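1. Introduction and summary. Let $\eta$ be a population parameter which is estimated by an
observed quantity $y$, normally distributed with variance $\sigma_y^2$. Let $\sigma_y^2 = \sum_{i=1}^{k} \lambda_i \sigma_i^2$, where the
$\lambda_i$ are known positive numbers and the $\sigma_i^2$ are unknown variances. Suppose that the observed
data provide estimates $s_i^2$ of these variances, based on $f_i$ degrees of freedom, respectively,
so that the sampling distribution of $s_i^2$ is

\[ p(s_i^2)\,ds_i^2 = \frac{1}{\Gamma(\tfrac{1}{2}f_i)} \left(\frac{f_i s_i^2}{2\sigma_i^2}\right)^{\frac{1}{2}f_i - 1} \exp\left[-\frac{f_i s_i^2}{2\sigma_i^2}\right] d\left(\frac{f_i s_i^2}{2\sigma_i^2}\right), \tag{1} \]

and that these estimates are distributed independently of each other and of $y$.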

A very simple particular case of this set-up occurs when we have samples of $n_1$ and $n_2$,
respectively, from two normal populations with true means $\alpha_1$ and $\alpha_2$ and standard
deviations $\sigma_1$ and $\sigma_2$. If $\eta$ is the true difference $(\alpha_1 - \alpha_2)$ between the means, the estimated difference
is $y = (\bar{x}_1 - \bar{x}_2)$. The variance of the estimate is $\sigma_y^2 = (\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2)$, where $\lambda_1 = 1/n_1$ and
$\lambda_2 = 1/n_2$. The estimated values of $\sigma_1^2$ and $\sigma_2^2$ are $s_1^2 = \Sigma_1/f_1$ and $s_2^2 = \Sigma_2/f_2$, where $\Sigma_1$ and $\Sigma_2$
are the respective sums of squares of observations from the individual sample means and
$f_1 = (n_1 - 1)$ and $f_2 = (n_2 - 1)$. These $s^2$ are distributed in the form (1) and the postulated
conditions of independence hold.
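
As a small illustration of the two-sample case just described, the following sketch computes $y$, the weights $\lambda_i$, the variance estimates $s_i^2$, the degrees of freedom $f_i$ and the estimated variance of $y$. The function name `two_sample_setup` and the sample values are hypothetical and serve only to show the bookkeeping.

```python
from statistics import mean, variance


def two_sample_setup(x1, x2):
    """Quantities of the two-sample case: y, lambda_i, s_i^2, f_i and the
    estimate lambda_1 * s_1^2 + lambda_2 * s_2^2 of the variance of y."""
    n1, n2 = len(x1), len(x2)
    y = mean(x1) - mean(x2)              # estimated difference of means
    lam = (1.0 / n1, 1.0 / n2)           # known positive weights
    s2 = (variance(x1), variance(x2))    # sums of squares divided by f_i
    f = (n1 - 1, n2 - 1)                 # degrees of freedom
    var_y_hat = lam[0] * s2[0] + lam[1] * s2[1]
    return y, lam, s2, f, var_y_hat


# Hypothetical observations, chosen only so the arithmetic is easy to follow.
y, lam, s2, f, var_y_hat = two_sample_setup([10, 12, 14], [1, 2, 3, 4, 5])
print(y, lam, s2, f, var_y_hat)
```

No assumption of equal population variances enters here; the two estimates $s_1^2$ and $s_2^2$ are kept separate, each with its own degrees of freedom.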

Another particular case, again with $k = 2$, arises when we wish to compare two regression
coefficients, fitted to independent sets of data, without making the assumption that the
population residual variance about the true regression line is the same for both sets.

The present paper is written mainly with these practical applications of the case $k = 2$
in mind, but the results are expressed generally for any $k$, since no further analytical
difficulties are involved. It will be shown how probability statements about $y$, considered as
an estimate of $\eta$, may be made similar in character to those which W. S. Gosset derived for
the mean of a single sample of $n$ observations (‘Student’, 1908). We shall, in effect, seek a
quantity $h$, calculable from the observations, with the property that the chance of the
difference $(y - \eta)$ falling short of $h$ is a given probability $P$. It is clear that $h$ must be a
function of the individual variances $s_i^2$ and of $P$. If the abbreviation Pr. is used to mean
‘the probability of the relation in the bracket following’, our problem is to satisfy the
equation

\[ \text{Pr.}\,[(y - \eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)] = P. \tag{2} \]

In Gosset’s case the solution was, of course, simply

\[ \text{Pr.}\,[(\bar{x} - \alpha) < t_P\, s/\sqrt{n}] = P, \tag{3} \]

where $t_P$ is the value, corresponding to the probability level $P$, in the ‘Student’ $t$-distribution
with $f = (n - 1)$ degrees of freedom.




In the next section the mathematical derivation of the exact solution of (2) is given. This
is then followed by some consideration of its expression in numerical terms. First, a series
solution in powers of $1/f_i$ is developed, which may be used for calculating tables. Then some
comparisons are made with a non-series approximate solution which is based on a particular
way of regarding the distribution of a quantity of the general form $z = (\Sigma a_i \chi_i^2)$.

Some brief discussion is then added which may serve to place the present contribution
in its proper relationship to other papers which have been written on this topic.

Finally, it is shown how the inequality (2) may be adapted to provide an interval estimate
for $\eta$.


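2. Mathematical derivation of solution. Let $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ denote the probability that
$(y - \eta)$ is less than $h(s_1^2, s_2^2, \ldots, s_k^2, P)$, given $s_i^2$ $(i = 1, 2, \ldots, k)$. Then, since $y$ is distributed quite
independently of the estimated variances, we have

\[ j(s_1^2, s_2^2, \ldots, s_k^2, P) = \int_{-\infty}^{h(s_1^2, s_2^2, \ldots, s_k^2, P)} \frac{1}{\sqrt{(2\pi\,\Sigma\lambda_i\sigma_i^2)}} \exp\left[-\frac{(y-\eta)^2}{2\,\Sigma\lambda_i\sigma_i^2}\right] d(y-\eta) = I\left[\frac{h(s_1^2, s_2^2, \ldots, s_k^2, P)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right], \tag{4} \]

where $I$ is used to denote the normal probability integral. The condition of equation (2) is
then simply that, if $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ is averaged over the probability distributions of $s_i^2$ as
given by (1), the result will equal $P$. Thus

\[ \int \cdots \int j(s_1^2, s_2^2, \ldots, s_k^2, P) \prod_i p(s_i^2)\, ds_i^2 = P. \tag{5} \]

Now we may expand $j(s_1^2, s_2^2, \ldots, s_k^2, P)$ about an origin $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2)$ in a Taylor expansion.
Thus

\[ j(s_1^2, s_2^2, \ldots, s_k^2, P) = \exp\left[\Sigma (s_i^2 - \sigma_i^2)\,\partial_i\right] j(w_1, w_2, \ldots, w_k, P), \tag{6} \]

it being understood that the exponential is to be expanded in a power series in $\partial_i$ and that
$\partial_i$ is to be interpreted so that

\[ \partial_i^r\, j(w_1, w_2, \ldots, w_k, P) = \left[\frac{\partial^r j(w_1, w_2, \ldots, w_k, P)}{\partial w_i^r}\right]_{w_i = \sigma_i^2\ (i = 1, 2, \ldots, k)}. \tag{7} \]

On making the substitution of (6) into (5) our result may be written

\[ \Theta\, j(w_1, w_2, \ldots, w_k, P) = P, \tag{8} \]

where

\[ \Theta = \prod_i \int \exp\left[(s_i^2 - \sigma_i^2)\,\partial_i\right] p(s_i^2)\, ds_i^2. \tag{9} \]

Now, substituting into (9) from (1), the integral comes out in simple form, i.e.

\[ \begin{aligned}
\Theta &= \prod_i \left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)^{-\frac{1}{2}f_i} \exp\left[-\sigma_i^2\partial_i\right] \\
&= \exp\left[-\Sigma\sigma_i^2\partial_i - \tfrac{1}{2}\Sigma f_i \log\left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)\right] \\
&= \exp\left[\Sigma\frac{\sigma_i^4\partial_i^2}{f_i} + \frac{4}{3}\Sigma\frac{\sigma_i^6\partial_i^3}{f_i^2} + 2\,\Sigma\frac{\sigma_i^8\partial_i^4}{f_i^3} + \text{etc.}\right] \\
&= 1 + \Sigma\frac{\sigma_i^4\partial_i^2}{f_i} + \left[\frac{4}{3}\Sigma\frac{\sigma_i^6\partial_i^3}{f_i^2} + \frac{1}{2}\left(\Sigma\frac{\sigma_i^4\partial_i^2}{f_i}\right)^2\right] + \text{etc.}
\end{aligned} \tag{10} \]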
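
The series in the exponent of (10) can be checked numerically for a single factor. In the sketch below the operator product $\sigma_i^2\partial_i$ is replaced by an ordinary number `x`, which is an assumption made purely for the purpose of checking the algebra of the logarithmic expansion; the function names are illustrative.

```python
import math


def log_theta_exact(x, f):
    """Logarithm of (1 - 2x/f)^(-f/2) * exp(-x), with x standing in
    for the product sigma_i^2 * d_i of one factor in (10)."""
    return -x - 0.5 * f * math.log(1.0 - 2.0 * x / f)


def log_theta_series(x, f):
    """First three terms of the exponent in (10) for one factor."""
    return x**2 / f + (4.0 / 3.0) * x**3 / f**2 + 2.0 * x**4 / f**3


x, f = 0.1, 10
exact = log_theta_exact(x, f)
series = log_theta_series(x, f)
print(exact, series, exact - series)
```

The difference between the two values is of the size of the first omitted term, which is proportional to $x^5/f^4$.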


Substituting (4) into (8) we have finally

\[
\Theta\, I\left(\frac{h(w_1, w_2, \ldots, w_k, P)}{\sqrt{\sum \lambda_i \sigma_i^2}}\right) = P. \tag{11}
\]

This, in a very condensed form, is the solution to our problem.* The operator $\Theta$ constitutes
a direction to carry out the partial differentiations indicated by (10). $w_i$ must then be equated
to $\sigma_i^2$. The solution of the resulting equation will give $h(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2, P)$ and therefore the
function $h(s_1^2, s_2^2, \ldots, s_k^2, P)$.





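3. The development of the series solution. It will be convenient to write $h(w)$ for
$h(w_1, w_2, \ldots, w_k, P)$ and $\xi$ for the normal deviate such that $I(\xi) = P$. We may then expand
$I\{h(w)/\sqrt{\sum \lambda_i \sigma_i^2}\}$ in a Taylor series about $\xi$ as origin. Thus

\[
I\left(\frac{h(w)}{\sqrt{\sum \lambda_i \sigma_i^2}}\right) = \exp\left[\left(\frac{h(w)}{\sqrt{\sum \lambda_i \sigma_i^2}} - \xi\right) D\right] I(v), \tag{12}
\]

it being understood that the exponential is to be expanded in powers of $D$, and that these
powers are to be interpreted so that

\[
D^r I(v) = \left[\frac{d^r I(v)}{dv^r}\right]_{v = \xi}. \tag{13}
\]

Equation (11) then becomes

\[
\Theta \exp\left[\left(\frac{h(w)}{\sqrt{\sum \lambda_i \sigma_i^2}} - \xi\right) D\right] I(v) = I(\xi). \tag{14}
\]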


This may now be solved by successive approximation.
The initial approximation is the large-sample normal approximation

\[
h_0(w) = \xi \sqrt{\sum \lambda_i w_i}, \tag{15}
\]

and we write

\[
h(w) = \xi \sqrt{\sum \lambda_i w_i} + h_1(w) + h_2(w) + \text{etc.}, \tag{16}
\]

where $h_1(w)$ includes terms of order $1/f_i$, $h_2(w)$ terms of order $1/f_i^2$ and so on. For the moment
we shall treat terms of the order $1/f_i^2$ as negligible. Then (14) gives


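\[
\Theta \exp\left[\left\{\frac{\xi\sqrt{\sum \lambda_i w_i} + h_1(w)}{\sqrt{\sum \lambda_i \sigma_i^2}} - \xi\right\} D\right] I(v) = I(\xi), \tag{17}
\]

i.e.

\[
\Theta \exp\left[\xi D \left\{\sqrt{\frac{\sum \lambda_i w_i}{\sum \lambda_i \sigma_i^2}} - 1\right\}\right] \left[1 + \frac{h_1(w)\, D}{\sqrt{\sum \lambda_i \sigma_i^2}}\right] I(v) = I(\xi). \tag{18}
\]

Or, using (10),

\[
\left[1 + \sum \frac{\sigma_i^4 \partial_i^2}{f_i}\right] \exp\left[\xi D \left\{\sqrt{\frac{\sum \lambda_i w_i}{\sum \lambda_i \sigma_i^2}} - 1\right\}\right] \left[1 + \frac{h_1(w)\, D}{\sqrt{\sum \lambda_i \sigma_i^2}}\right] I(v) = I(\xi), \tag{19}
\]

and we may write

\[
\left[\frac{h_1(\sigma^2)\, D}{\sqrt{\sum \lambda_i \sigma_i^2}} + \sum \frac{\sigma_i^4 \partial_i^2}{f_i} \exp\left(\xi D \left\{\sqrt{\frac{\sum \lambda_i w_i}{\sum \lambda_i \sigma_i^2}} - 1\right\}\right) + \text{terms of order } 1/f_i^2\right] I(v) = 0.
\]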


The equation of the first order term to zero gives

\[
h_1(\sigma^2) = \frac{\xi (1+\xi^2)}{4}\, \frac{\sum \lambda_i^2 \sigma_i^4 / f_i}{\left(\sum \lambda_i \sigma_i^2\right)^{3/2}}. \tag{20}
\]


* Equation (11) can also be expressed as an integral equation and this form may be necessary for
providing numerical values where the $f_i$ are very small.




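This can then be substituted in the second-order term which, when equated to zero, will
give $h_2(\sigma^2)$. The process may obviously be extended to higher orders, although the expressions
become so complex that a slightly different procedure has then been found to be preferable.
To terms of order $1/f_i^2$ our solution is

\[
h(s^2) = \xi \sqrt{\sum \lambda_i s_i^2}\, \left[1 + \frac{(1+\xi^2)}{4}\, \frac{\sum \lambda_i^2 s_i^4 / f_i}{\left(\sum \lambda_i s_i^2\right)^2}
- \frac{(1+\xi^2)}{2}\, \frac{\sum \lambda_i^2 s_i^4 / f_i^2}{\left(\sum \lambda_i s_i^2\right)^2}
+ \frac{(3 + 5\xi^2 + \xi^4)}{3}\, \frac{\sum \lambda_i^3 s_i^6 / f_i^2}{\left(\sum \lambda_i s_i^2\right)^3}
- \frac{(15 + 32\xi^2 + 9\xi^4)}{32}\, \frac{\left(\sum \lambda_i^2 s_i^4 / f_i\right)^2}{\left(\sum \lambda_i s_i^2\right)^4}\right]. \tag{21}
\]

It may be noted that in the particular case $k = 1$, this reduces, as it should, to the already
known expansion of the deviate of the straightforward ‘Student’ distribution (Fisher,
1941, p. 151), viz.

\[
t_P = \xi \left[1 + \frac{(1+\xi^2)}{4f} + \frac{(3 + 16\xi^2 + 5\xi^4)}{96 f^2} + \text{etc.}\right]. \tag{22}
\]

It is proposed in another communication to give tables of $h(s^2)$ based on the expansion
(21) carried to some further terms.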
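
The reduction of (21) to (22) when $k = 1$ can be checked numerically. The following is a minimal sketch, not the tabulation procedure referred to above; the function names `welch_h` and `student_series` are hypothetical, and (21) is taken in the form with coefficients $(1+\xi^2)/4$, $-(1+\xi^2)/2$, $(3+5\xi^2+\xi^4)/3$ and $-(15+32\xi^2+9\xi^4)/32$.

```python
from math import sqrt
from statistics import NormalDist


def welch_h(lams, s2, f, P):
    """Series (21) for h(s^2, P), truncated after the terms of order 1/f_i^2."""
    xi = NormalDist().inv_cdf(P)  # normal deviate xi with I(xi) = P
    terms = list(zip(lams, s2, f))
    v = sum(l * s for l, s, _ in terms)                     # sum lambda_i s_i^2
    a1 = sum((l * s) ** 2 / fi for l, s, fi in terms)       # sum lambda_i^2 s_i^4 / f_i
    a2 = sum((l * s) ** 2 / fi ** 2 for l, s, fi in terms)  # sum lambda_i^2 s_i^4 / f_i^2
    a3 = sum((l * s) ** 3 / fi ** 2 for l, s, fi in terms)  # sum lambda_i^3 s_i^6 / f_i^2
    bracket = (
        1
        + (1 + xi**2) / 4 * a1 / v**2
        - (1 + xi**2) / 2 * a2 / v**2
        + (3 + 5 * xi**2 + xi**4) / 3 * a3 / v**3
        - (15 + 32 * xi**2 + 9 * xi**4) / 32 * a1**2 / v**4
    )
    return xi * sqrt(v) * bracket


def student_series(f, P):
    """Expansion (22) of the 'Student' deviate, to order 1/f^2."""
    xi = NormalDist().inv_cdf(P)
    return xi * (1 + (1 + xi**2) / (4 * f) + (3 + 16 * xi**2 + 5 * xi**4) / (96 * f**2))


# With k = 1 and lambda_1 = s_1^2 = 1 the two series should coincide.
print(welch_h([1.0], [1.0], [10], 0.95), student_series(10, 0.95))
```

For $f = 10$ and $P = 0.95$ both series give a value close to the tabulated ‘Student’ deviate of about 1.81; the truncation after $1/f^2$ accounts for the small remaining difference.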


4. Discussion of a non-series approximation. It will be recalled that in Gosset’s original
approach to the single sample problem (‘Student’, 1908) his initial step was to note that the
first four moments of the distribution of $s^2$ were consistent with the assumption that the
distribution could be represented by a Pearson Type III curve. He was fortunate in this
way to rediscover a distribution which had already been found by Helmert, as this permitted
him to go on to the derivation of the $t$-distribution. In our present case, as in many others
arising naturally in statistical work, we are led to consider, instead of $s^2$, a linear function
$\sum \lambda_i s_i^2$ of several $s_i^2$. If this linear function were distributed in a Pearson Type III distribution
a whole range of new problems could be dealt with by well-established theory. However, in
general, we do not have this good fortune. For $\sum \lambda_i s_i^2$ is of the form $\sum a_i \chi_i^2$, where $a_i = \lambda_i \sigma_i^2 / f_i$,
and the distribution of this quantity is only of Type III if all the $a_i$, except one, are zero, or
if all the $a_i$ happen to be equal.

Nevertheless, for practical purposes an approximation to the distribution of $\sum \lambda_i s_i^2$, using
a Type III curve with start, mean and variance suitably adjusted, can still be useful. In two
previous papers (Welch, 1936, 1938) I have employed this method to obtain numerical
comparisons of the merits of different statistical procedures, where full calculations with the
true distributions would have been unduly laborious. The method of determining the
constants in the approximation was given for the case $k = 2$ in the first of these papers and is
as follows.


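If $z = (a\chi_1^2 + b\chi_2^2)$, and the approximate distribution curve is written in the form

\[
p(z)\, dz = \frac{1}{\Gamma(\tfrac{1}{2} f)}\, e^{-z/(2g)} \left(\frac{z}{2g}\right)^{\frac{1}{2} f - 1} d\left(\frac{z}{2g}\right), \tag{23}
\]

then making the first two moments of (23) agree with the true moments of $z$, we find

\[
g = \frac{a^2 f_1 + b^2 f_2}{a f_1 + b f_2}, \qquad f = \frac{(a f_1 + b f_2)^2}{a^2 f_1 + b^2 f_2}. \tag{24}
\]

Phrasing the matter rather differently, we can say that $z/g$ is approximately distributed as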




$\chi^2$ with degrees of freedom $f$. Of course $f$, given by (24), will in general be fractional, but the
letter used to designate this quantity was chosen, and the term ‘effective degrees of freedom’ 
has been used, because by doing so we can appeal immediately to a considerable body of 
further theoretical results. 


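In particular we can say that the criterion

\[
v = \frac{(y - \eta)}{\sqrt{\lambda_1 s_1^2 + \lambda_2 s_2^2}} \tag{25}
\]

follows approximately the ‘Student’ $t$-distribution with degrees of freedom

\[
f = \frac{(\lambda_1 \sigma_1^2 + \lambda_2 \sigma_2^2)^2}{\dfrac{\lambda_1^2 \sigma_1^4}{f_1} + \dfrac{\lambda_2^2 \sigma_2^4}{f_2}}. \tag{26}
\]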


More generally, when $k$ is not restricted to 2, the same line of argument leads us to say
that the criterion

\[
v = \frac{(y - \eta)}{\sqrt{\sum \lambda_i s_i^2}} \tag{27}
\]

is approximately distributed as ‘Student’s’ $t$ with degrees of freedom

\[
f = \frac{\left(\sum \lambda_i \sigma_i^2\right)^2}{\sum \lambda_i^2 \sigma_i^4 / f_i}. \tag{28}
\]


Not knowing the $\sigma_i^2$’s in (28), there are several ways in which we may now proceed, depending
on what weight we may be willing to attach to any vague a priori notions we may possess
of their relative magnitudes (cf. Welch, 1938). If we are not willing to assume anything,
perhaps the best choice is

\[
f = \frac{\left(\sum \lambda_i s_i^2\right)^2 - 2\sum \dfrac{\lambda_i^2 s_i^4}{f_i + 2}}{\sum \dfrac{\lambda_i^2 s_i^4}{f_i + 2}}. \tag{29}
\]

It may be shown that the numerator of (29) has, in repeated samples, an average value
$(\sum \lambda_i \sigma_i^2)^2$, and the denominator has average value $\sum \lambda_i^2 \sigma_i^4 / f_i$. In a certain sense, therefore,
(29) is a fair estimate of (28).

To sum up, then, the interpretation of $y$ as an estimate of $\eta$, using the present type of
approximation, involves only the reference of the criterion (27) to tables of the ‘Student’
distribution, entered with degrees of freedom given by (29).
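
The procedure just summarized can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and (29) is taken in the form $f = (\sum\lambda_i s_i^2)^2 / \sum\{\lambda_i^2 s_i^4/(f_i+2)\} - 2$, which is the form consistent with the average values quoted for its numerator and denominator.

```python
from math import sqrt


def effective_df(lams, s2, f):
    """Estimated effective degrees of freedom, taken here as
    (sum lambda_i s_i^2)^2 / sum{lambda_i^2 s_i^4 / (f_i + 2)} - 2."""
    v = sum(l * s for l, s in zip(lams, s2))
    d = sum((l * s) ** 2 / (fi + 2) for l, s, fi in zip(lams, s2, f))
    return v**2 / d - 2


def criterion(y, eta, lams, s2):
    """Criterion (27): v = (y - eta) / sqrt(sum lambda_i s_i^2)."""
    return (y - eta) / sqrt(sum(l * s for l, s in zip(lams, s2)))


# Single-sample check: with k = 1 the estimate returns f_1 itself.
print(effective_df([1.0], [2.5], [9]))
```

The value of `criterion` would then be referred to ‘Student’ tables entered with the (generally fractional) degrees of freedom returned by `effective_df`.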

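Some further light is now thrown on this procedure by the expansion for the exact solution
of our problem derived in the preceding section. For the implications of referring $v$ to the
‘Student’ distribution may be seen by substituting $f$ from (29) into the expansion (22) of
the ‘Student’ deviate. On doing this and then expanding in powers of $1/f_i$ it is found that,
in effect, our approximation corresponds to assuming that

\[
h(s^2) = \xi \sqrt{\sum \lambda_i s_i^2}\, \left[1 + \frac{(1+\xi^2)}{4}\, \frac{\sum \lambda_i^2 s_i^4 / f_i}{\left(\sum \lambda_i s_i^2\right)^2}
- \frac{(1+\xi^2)}{2}\, \frac{\sum \lambda_i^2 s_i^4 / f_i^2}{\left(\sum \lambda_i s_i^2\right)^2}
+ \frac{(51 + 64\xi^2 + 5\xi^4)}{96}\, \frac{\left(\sum \lambda_i^2 s_i^4 / f_i\right)^2}{\left(\sum \lambda_i s_i^2\right)^4}\right], \tag{30}
\]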


whereas, in fact, the true solution is given by (21). Comparison shows that we have exact
agreement to terms of order $1/f_i$ and in the first of the quadratic terms. To the second order
the difference between the expressions in square brackets in equations (21) and (30) is

\[
\frac{(3 + 5\xi^2 + \xi^4)}{3} \left\{\frac{\sum \lambda_i^3 s_i^6 / f_i^2}{\left(\sum \lambda_i s_i^2\right)^3} - \frac{\left(\sum \lambda_i^2 s_i^4 / f_i\right)^2}{\left(\sum \lambda_i s_i^2\right)^4}\right\}. \tag{31}
\]

This difference vanishes if any one of the $s_i^2$ is overwhelmingly larger than all the others, or
if $s_i^2$ is proportional to $f_i/\lambda_i$. It appears that, in general, the difference is not likely to be large.
We have, therefore, found some justification for using the Type III approximation in the
present case.

The above comparison has been made on the basis of the series developments, but it should
be borne in mind that approximations based on positive frequency functions, such as those
falling under the Pearson system, usually provide a higher degree of accuracy than might
appear from any consideration of expansions. Furthermore, they are apt to give an insight
into the nature of the situation which may sometimes be lost in working out the details of
exact solutions. In the present case I feel that the comparison of this section serves to give
added confidence in the exact solution,* which I have put forward in the previous two
sections, quite as much as it demonstrates the value of the approximate method.
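
The two cases in which the second-order difference (31) is said to vanish can be checked numerically. A minimal sketch follows; the function name is hypothetical and the numerical inputs are illustrative values, not data from the paper.

```python
def type3_discrepancy(lams, s2, f, xi):
    """Second-order difference (31) between the square brackets of (21) and (30)."""
    terms = list(zip(lams, s2, f))
    v = sum(l * s for l, s, _ in terms)                            # sum lambda_i s_i^2
    c = sum((l * s) ** 3 / fi**2 for l, s, fi in terms) / v**3     # cubic term
    d = sum((l * s) ** 2 / fi for l, s, fi in terms) ** 2 / v**4   # squared term
    return (3 + 5 * xi**2 + xi**4) / 3 * (c - d)


xi = 1.6449
# s_i^2 proportional to f_i / lambda_i: lambda_i s_i^2 equals f_i here.
print(type3_discrepancy([0.5, 0.25], [12.0, 48.0], [6, 12], xi))
# One s_i^2 overwhelmingly larger than the other.
print(type3_discrepancy([1.0, 1.0], [1.0e9, 1.0], [5, 8], xi))
# A generic case, for comparison.
print(type3_discrepancy([1.0, 1.0], [1.0, 4.0], [5, 5], xi))
```

In the first two cases the result is zero up to rounding (or negligibly small), while the generic case gives a small positive value.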


5. Further discussion. In comparing the present contribution with other work on the
subject, the essential point to notice is the averaging process involved in equation (5). We
are not trying here to make probability statements valid for fixed $s_i^2$, but are averaging over
the joint probability distribution of the $s_i^2$, taking into account, therefore, the different
values which can arise by chance in sampling from populations with fixed $\sigma_i^2$.

This averaging over the joint distribution of the $s_i^2$ is parallel to the step taken in
Section III of Gosset’s original memoir (1908) where, in effect, he starts with the distribution
of $t$ for samples with fixed $s$ and then averages over the distribution of $s$ which he has already
derived earlier. He thus arrives at the unrestricted distribution of $t$ (or, more strictly, of a
quantity $z$, which is equal to $t$ multiplied by a constant). This distribution forms the basis
of the significance tests which he illustrates in his Section IX and of the method of deriving
interval estimates for the population mean which he outlines in his Section VIII.

In the present paper the parallelism with Gosset’s work may be obscured to some extent 
by the fact that we do not from the outset seek the probability distribution of some pivotal 
quantity like t, explicitly expressed. It so happens that we are able to proceed to a method 
of deriving an expansion for the required probability level without making explicit reference 
to such a quantity. Nevertheless there remains the important resemblance with Gosset’s 
development, in that we do not confine ourselves to samples with fixed $s_i^2$.

This procedure stands in sharp contrast to the formulation of the problem of comparing 
two means, favoured by R. A. Fisher (e.g. 1941) and H. Jeffreys (1940). These writers 
prefer a solution which they ascribe initially to W. U. Behrens (1929). Looked at from one 
point of view, Behrens’s paper appears to contain some gross algebraical errors. Fisher and 
Jeffreys, however, develop lines of argument by means of which they claim that Behrens’s 
solution is quite justified. It seems to me difficult to say how far (if at all) any of these 
arguments may have been in Behrens’s mind when he wrote his paper and I shall not 
attempt to elucidate this question here. We may, however, permit ourselves one observation 
about the developments according to Fisher and Jeffreys. 


* Exact in the sense that it is independent of the irrelevant population parameters $\sigma_i^2$.




Both these writers, at some stage, limit the field of their probability inferences to a
subset in which the $s_i^2$ are regarded as fixed. In order to solve the problem on these lines Jeffreys
introduces an a priori distribution function for the unknown $\sigma_i$, following his general
philosophy for dealing with such questions. Fisher, on the other hand, arrives at the same
answer by a special utilization of what he terms the fiducial distribution of $\sigma_i$.

Jeffreys’s approach here does not raise any new issues to those who are familiar with the 
general body of his researches on statistical inference. Fisher’s justification of Behrens’s
solution is perhaps of more immediate interest as it raises controversial points which are 
important more specifically in relation to our present topic of discussion. For although 
Fisher’s approach has been very much criticized by a number of writers, starting with 
M. S. Bartlett (1936), the critics have not wished to throw doubt on the whole body of 
results which Fisher includes under the heading of fiducial inference. The criticism has been 
for the most part selective, directed mainly at the way in which so-called simultaneous 
fiducial distributions of several parameters have been defined and manipulated. 

I have, myself, quite definite views on these questions (particularly on the usage of the
word ‘fiducial’) but do not feel that I need express them at any great length here. I
disagree with Fisher, but this divergence of opinion must already have become apparent in
the way I have defined the field within which I make my probability inferences about $\eta$.
It appears to me to be quite artificial to restrict our view to one which, even in a limited
sense, fixes $s_i^2$. It is true that, in the two-sample problem, we have to draw our inferences
from the unique pair of samples observed, or, more precisely, from the statistics $\bar{x}_1$, $\bar{x}_2$, $s_1^2$
and $s_2^2$ which they provide. These statistics are our only data for the purpose of making
inferences, but we add something to these data in the interpretation when we regard the
samples as being drawn randomly from hypothetical normal populations. Once having
embarked on this method of interpretation, we should stick to it consistently throughout.
The sampling variations of $s_i^2$ should be taken into account only by a direct use of the
probability distributions as given by our equation (1) and not by any inversion such as is
involved in Fisher’s conception of the fiducial distribution of $\sigma_i^2$. As we have seen, it is
quite possible to make probability statements about the difference between the population
means without making any reference whatever either to inverse probability or to fiducial
distributions.

The distinction between the procedure which Fisher advocates and one which averages
over the $s_i^2$ distributions has, of course, been stressed by most of the writers who have
contributed papers on the subject, from whatever viewpoint (e.g. Bartlett, 1936, p. 566,
and Yates, 1939). What has been lacking hitherto, however, is a solution, analogous to
Gosset’s single sample solution, which makes complete use of the information contained
in the data provided. Bartlett indicated one particular way in which probability inferences
about the difference between two population means might be made, but was careful to point
out that the problem of making the best possible inferences (in the theoretical sense of
utilizing all the information in the data to its full extent) was still an open one. There has
indeed been some doubt expressed whether a fully satisfactory solution from this point of
view existed at all. I believe, however, that the one I advance above in equation (11),
and develop in equation (21), meets all the requirements that one can reasonably expect.

Whatever conclusion the reader may come to on these matters, however, he will probably
wish to know how, in the numerical details, this solution will differ from that of Behrens.
This will be more easily seen when some tables become available, but fortunately certain
comparisons can already be made. For Fisher (1941, p. 155) has provided a series
expansion of the Behrens solution. In our notation, and with $k = 2$, this may be written, to
order $1/f_i$, as follows:

\[
h(s^2) = \xi \sqrt{\lambda_1 s_1^2 + \lambda_2 s_2^2}\, \left[1 + \frac{(1+\xi^2)}{4}\, \frac{\dfrac{\lambda_1^2 s_1^4}{f_1} + \dfrac{\lambda_2^2 s_2^4}{f_2}}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2}
+ \left(\frac{1}{f_1} + \frac{1}{f_2}\right) \frac{\lambda_1 \lambda_2 s_1^2 s_2^2}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2}\right]. \tag{32}
\]

Even to this order, this differs from our equation (21) in the inclusion of an extra term. In
other words, although the two solutions are the same when samples are large enough to
adopt the large-sample normal approximation, they differ immediately we take into account
the first corrective term, i.e. they differ as soon as we begin to attach any importance to
‘Studentization’.





6. An interval estimate for $\eta$. We have shown in §§2 and 3 how to calculate a value
$h(s_1^2, s_2^2, \ldots, s_k^2, P)$, depending on the observed variances $s_1^2, s_2^2, \ldots, s_k^2$, such that the probability
is $P$ that $(y - \eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)$. This provides a method of testing the consistency of an
observed $y$ with a prescribed value $\eta$.

When the question is not whether any particular given $\eta$ is contradicted by the data, but
rather one of estimating $\eta$ and at the same time of providing a measure of the uncertainty
of the estimate, the further step required is immediate. For, as in the case of a single sample,
the order of the words in our probability statement can be changed so that it becomes: the
probability is $P$ that $\eta$ is greater than $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P)\}$. An interval estimate for $\eta$ is then
obtained by taking two levels $P_1$ and $P_2$ for $P$. Thus the probability is $(P_1 - P_2)$ that $\eta$ lies
between $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_1)\}$ and $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_2)\}$.

If $P_2 = (1 - P_1)$ the range will be symmetrically placed about $y$. Thus, for example, if
$P_1 = 0.95$ and $P_2 = 0.05$, the chance will be 90 % that $\eta$ lies within the range

\[
y \pm 1.6449 \sqrt{\sum \lambda_i s_i^2}\, \left[1 + \frac{(1 + 1.6449^2)}{4}\, \frac{\sum \lambda_i^2 s_i^4 / f_i}{\left(\sum \lambda_i s_i^2\right)^2} + \text{etc.}\right]. \tag{33}
\]


It may be noted, incidentally, that this range is always narrower than similar ranges 
calculated from Behrens’s solution. 
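
As an illustration of (33), the following sketch computes the symmetric range keeping only the correction of order $1/f_i$. The function name, the `coverage` argument and the numerical inputs are hypothetical; terms of higher order (the ‘etc.’ of (33)) are simply dropped.

```python
from math import sqrt
from statistics import NormalDist


def interval_estimate(y, lams, s2, f, coverage=0.90):
    """Symmetric range for eta from (33), keeping only the 1/f_i correction."""
    xi = NormalDist().inv_cdf(0.5 + coverage / 2)  # 1.6449 when coverage is 90 %
    v = sum(l * s for l, s in zip(lams, s2))                        # sum lambda_i s_i^2
    a1 = sum((l * s) ** 2 / fi for l, s, fi in zip(lams, s2, f))    # sum lambda_i^2 s_i^4 / f_i
    half_width = xi * sqrt(v) * (1 + (1 + xi**2) / 4 * a1 / v**2)
    return y - half_width, y + half_width


# Illustrative two-sample case: lambda_i = 1/n_i with n = (10, 20), f_i = n_i - 1.
low, high = interval_estimate(5.0, [0.1, 0.05], [4.0, 9.0], [9, 19])
print(low, high)
```

The bracketed factor exceeds unity, so the range is somewhat wider than the large-sample normal range $y \pm 1.6449\sqrt{\sum\lambda_i s_i^2}$.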


REFERENCES


BARTLETT, M. S. (1936). The information available in small samples. Proc. Camb. Phil. Soc. 32, 560-6.

BEHRENS, W. U. (1929). Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen. Landw. Jb.
68, 807-37.

FISHER, R. A. (1941). The asymptotic approach to Behrens’s integral, with further tables for the d test
of significance. Ann. Eugen., Lond., 11, 141-72.

JEFFREYS, H. (1940). Note on the Behrens-Fisher formula. Ann. Eugen., Lond., 10, 48-51.

‘STUDENT’ (1908). The probable error of a mean. Biometrika, 6, 1-25.

WELCH, B. L. (1936). Specification of rules for rejecting too variable a product, with particular
reference to an electric lamp problem. J. Roy. Statist. Soc. Suppl. 3, 29-48.

WELCH, B. L. (1938). The significance of the difference between two means when the population
variances are unequal. Biometrika, 29, 350-62.

YATES, F. (1939). An apparent inconsistency arising from tests of significance based on fiducial
distributions of unknown parameters. Proc. Camb. Phil. Soc. 35, 579-91.










THE DISTRIBUTION OF KENDALL’S τ COEFFICIENT OF RANK
CORRELATION IN RANKINGS CONTAINING TIES


By G. P. SILLITTO, Ph.D., B.Sc., Research Department, I.C.I. (Explosives Ltd.)


A new coefficient of rank correlation has been described by Kendall (1938, 1942, 1943) and
denoted by him as τ. This coefficient has advantages over Spearman’s ρ in respect of the
smoothness of its distribution and the rapidity with which it approaches normality, thus
facilitating significance testing, and in being readily adapted to cases of partial rank
correlation.

The distribution of τ has been worked out by Kendall (1938, 1943) for cases in which
neither ranking contains members which are graded equal, i.e. rankings containing no 
‘ties’. It is the purpose of the present paper to deal with cases, which frequently arise in 
practice, in which ties occur in one of the two rankings. The method is a generalization of 
that of Kendall and will be given in some detail for the case of tied pairs, while the results 
of further generalization to multiplet ties will be indicated without detailed proof, which can 
in all cases be effected simply on the lines indicated. 


DEFINITION OF τ FOR RANKINGS CONTAINING TIES

In counting the ‘score’ of a pair of rankings, by the methods suggested by Kendall, each
member is compared with the other members of the same ranking, and additions to or
subtractions from the score are made depending on whether it is smaller or greater in each case.
If some members are ranked equal then it is proposed that no change be made in the score
in comparing them. This obviously accords with the intuitive aspects of ranking. Thus in
the pair of rankings following, the score is +8:

1 2 3 4 5 6

2 1 3 5 6 3


The maximum score possible is thus obviously reduced by the presence of ties, and it is
evident that the presence of each tied pair reduces the maximum possible score by unity,
so that it becomes $\tfrac{1}{2} n(n-1) - p_2$, for the case of a ranking of $n$ members containing $p_2$ pairs.
Thus for such a ranking τ would be defined as

\[
\tau = 2S / \{n(n-1) - 2p_2\},
\]

where $S$ is the observed score.

Generally, each $r$-tuplet tie reduces the maximum possible score by $\tfrac{1}{2} r(r-1)$* so that for
a ranking of $n$ members containing $p_2$ pairs, $p_3$ triplets, ..., $p_r$ $r$-tuplets,

\[
\tau = \frac{2S}{n(n-1) - 2p_2 - 6p_3 - \ldots - r(r-1)\, p_r}.
\]
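
The scoring rule and the definition of τ with ties can be sketched as follows. The function names are hypothetical, ties are assumed to occur only in the second ranking (as in the cases treated here), and the example pair is read as 1 2 3 4 5 6 against 2 1 3 5 6 3.

```python
from collections import Counter


def score(x, y):
    """Score S: +1 or -1 for each pair of members according as the two rankings
    agree or disagree in order, and 0 when the pair is tied in either ranking."""
    def sgn(a):
        return (a > 0) - (a < 0)
    n = len(x)
    return sum(sgn(x[j] - x[i]) * sgn(y[j] - y[i])
               for i in range(n) for j in range(i + 1, n))


def tau(x, y):
    """tau = 2S / {n(n-1) - sum of r(r-1) over the tied groups of y}."""
    n = len(x)
    tie_term = sum(r * (r - 1) for r in Counter(y).values())  # untied members give 0
    return 2 * score(x, y) / (n * (n - 1) - tie_term)


natural = [1, 2, 3, 4, 5, 6]
print(score(natural, [2, 1, 3, 5, 6, 3]))  # the example pair of rankings
print(tau(natural, [2, 1, 3, 5, 6, 3]))
```

For the example pair the score is 8 and the denominator is $30 - 2 = 28$, so that τ is $16/28$.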





THE SUM OF THE FREQUENCIES OF THE POSSIBLE SCORES 
When no ties are present, each permutation of the n members produces a possible score so 
that there are in all n! possible scores. When ties are present they decrease the number of 
possible permutations of an assigned set of members, but, on the other hand, they give rise 


* This result has been given by Kendall (1945).










to further families of scores due to the different places in the ranking which can be occupied 
by the tied members. Thus, for instance, the rankings 


113456, 122456, 123356, 123446, 123455 

all give rise to the maximum score, 14. 

Considering any assigned ranking, the number of possible permutations with $p_2$ pairs
present is $n!/2^{p_2}$, or if there are in addition $p_3$ triplets, ..., $p_r$ $r$-tuplets, $n!/\{(2!)^{p_2} (3!)^{p_3} \ldots (r!)^{p_r}\}$.

The distribution of scores of an assigned set of ranks will be referred to as the basic
distribution for the type of ranking concerned, since consideration of the possible ways of
assigning the $p_2$ pairs, $p_3$ triplets, etc., among the members of the ranking has only the effect
of multiplying the frequency of each score by a constant factor. This factor is the number of
ways of distributing the $p_1 + p_2 + p_3 + \ldots + p_r$ ranks among the $n$ members. This is the
number of possible permutations

\[
\frac{(p_1 + p_2 + p_3 + \ldots + p_r)!}{p_1!\, p_2!\, p_3! \ldots p_r!}.
\]


BASIC FREQUENCY DISTRIBUTIONS OF THE SCORES

The basic frequency distributions can be established by an extension of the methods given
by Kendall. Considering first the case of tied pairs the frequency function of the basic
distribution of the scores may be written $f(S, n, p_2)$, where $p_2$ is the number of pairs. The
frequency generating function is then $\sum_i f(S_i, n, p_2)\, t^{S_i}$. Now consider the addition of another
tied pair, with a greater ranking than any of the existing ranks. If it is added to the extreme
left of the ranking it adds $-2n$ to the score. Moving one of the pair one place to the right
adds 2 to this new score; bringing the other added member up to it adds another 2. Starting
again with both the new members on the extreme left, movement of one of them two places
to the right adds 4 to the new score, bringing the other up to it in two steps each of one place,
adds successively a further 2 and 4. Proceeding in this way all possible additions to the old
score which may be brought about by the addition of a tied pair of new members are
represented by the array

\[
\begin{array}{cccccc}
-2n & -(2n-2) & -(2n-4) & -(2n-6) & \ldots & 0 \\
 & -(2n-4) & -(2n-6) & -(2n-8) & \ldots & +2 \\
 & & -(2n-8) & -(2n-10) & \ldots & +4 \\
 & & & -(2n-12) & \ldots & \vdots \\
 & & & & & +2n
\end{array}
\]

Thus the addition of a new tied pair has the effect of multiplying the frequency generating
function by

\[
\{t^{-2n} + (t^{-(2n-2)} + t^{-(2n-4)}) + (t^{-(2n-4)} + t^{-(2n-6)} + t^{-(2n-8)}) + \ldots + (t^{0} + t^{2} + \ldots + t^{2n})\}.
\]

The addition of a single new member to the ranking has the effect of multiplying the
frequency generating function by

\[
\{t^{-n} + t^{-(n-2)} + \ldots + t^{n-2} + t^{n}\}
\]

as shown by Kendall, the presence of tied pairs in the existing ranking having no effect.
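
The two multiplications described above lend themselves to a short computation. The sketch below is one way of organizing it, not necessarily the procedure used to draw up the tables; the function names are hypothetical, a generating function is held as a mapping from score to frequency, and the untied members are (arbitrarily) added before the tied pairs.

```python
from collections import Counter


def multiply(dist, exponents):
    """Multiply a generating function (score -> frequency) by the sum of t**e."""
    out = Counter()
    for s, c in dist.items():
        for e in exponents:
            out[s + e] += c
    return out


def add_single(dist, n):
    """Add one new highest member to a ranking of n members:
    factor t^-n + t^-(n-2) + ... + t^n."""
    return multiply(dist, [2 * a - n for a in range(n + 1)])


def add_pair(dist, n):
    """Add a new highest tied pair to a ranking of n members:
    exponents 2(a + b) - 2n for 0 <= a <= b <= n, as in the array above."""
    return multiply(dist, [2 * (a + b) - 2 * n
                           for a in range(n + 1) for b in range(a, n + 1)])


def basic_distribution(n, p2):
    """Basic frequency distribution of S for n members containing p2 tied pairs."""
    dist, m = Counter({0: 1}), 0
    for _ in range(n - 2 * p2):   # untied members first
        dist, m = add_single(dist, m), m + 1
    for _ in range(p2):           # then the tied pairs
        dist, m = add_pair(dist, m), m + 2
    return dist


d = basic_distribution(4, 1)
print(sorted((s, c) for s, c in d.items() if s > 0))
```

For $n = 4$ with one tied pair the positive half of the distribution has frequencies 3, 2, 1 at scores 1, 3, 5, and the frequencies over all scores sum to $4!/2 = 12$.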







With these two recurrence relations there is no difficulty in drawing up a table of basic 
frequency distributions for tied pairs as exemplified in Table 1, in which only positive values 
of the score are shown, negative values being obtainable by symmetry. 


Table 1. Distribution of the score S for values of n from 3 to 7, and for rankings containing
p2 pairs of members ranked equal (only positive half of symmetrical distribution)

                                             Values of S
 n p2 |   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21

 3  0 |   .   2   .   1
 3  1 |   1   .   1

 4  0 |   6   .   5   .   3   .   1
 4  1 |   .   3   .   2   .   1
 4  2 |   2   .   1   .   1

 5  0 |  22   .  20   .  15   .   9   .   4   .   1
 5  1 |   .  11   .   9   .   6   .   3   .   1
 5  2 |   6   .   5   .   4   .   2   .   1

 6  0 |   . 101   .  90   .  71   .  49   .  29   .  14   .   5   .   1
 6  1 |  52   .  49   .  41   .  30   .  19   .  10   .   4   .   1
 6  2 |   .  26   .  23   .  18   .  12   .   7   .   3   .   1
 6  3 |  14   .  12   .  11   .   7   .   5   .   2   .   1

 7  0 |   . 573   . 531   . 455   . 359   . 259   . 169   .  98   .  49   .  20   .   6   .   1
 7  1 | 292   . 281   . 250   . 205   . 154   . 105   .  64   .  34   .  15   .   5   .   1
 7  2 |   . 146   . 135   . 115   .  90   .  64   .  41   .  23   .  11   .   4   .   1
 7  3 |  74   .  72   .  63   .  52   .  38   .  26   .  15   .   8   .   3   .   1







Before the construction of the table has proceeded far, however, it becomes evident that
there is a recurrence relation between individual frequencies for any given value of $n$, such
that the frequency of any score $S_i$ for $p_2$ pairs is the sum of the frequencies of $S_i - 1$ and
$S_i + 1$ for $p_2 + 1$ pairs. This obviously arises from the fact that if two members ranked equal,
say $r$th, in a ranking with $p_2 + 1$ pairs are subsequently distinguished and given rankings $r$
and $r + 1$, this will increase the score by unity if the $(r+1)$th member falls after the $r$th member
when the ranking is arrayed against another ranking in the natural order $1, 2, 3, \ldots, n$, and
reduce it by unity if the other member of the pair becomes the $(r+1)$th; and these two
possibilities complete the ways of forming a ranking with $p_2$ pairs from one with $p_2 + 1$ pairs.

This simple relationship, which may be written

\[
f(S_i, n, p_2) = f(S_i + 1, n, p_2 + 1) + f(S_i - 1, n, p_2 + 1),
\]

or taking another way of writing the basic distribution function

\[
\phi(S_i, p_1, p_2) = \phi(S_i + 1, p_1 - 2, p_2 + 1) + \phi(S_i - 1, p_1 - 2, p_2 + 1), \tag{1}
\]

$p_1$ being the number of members not in tied pairs, is of great assistance in tabulating the
frequency distribution, and will be used below to establish the formula for the variance of $S$.
It can be generalized to cover the effect of increasing the number of $r$-tuplets, when it becomes

\[
\begin{aligned}
\phi(S_i, p_1, p_2, \ldots, p_{r-1}, p_r) ={} & \phi(S_i - r + 1, p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
& + \phi(S_i - r + 3, p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
& + \ldots \\
& + \phi(S_i + r - 1, p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1).
\end{aligned} \tag{2}
\]
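
Relations (1) and (2) can be confirmed for small $n$ by direct enumeration. The following is a brute-force sketch with hypothetical function names; each ranking type is represented by one assigned set of ranks, tied members sharing a rank, and (2) is taken with the shifts $-(r-1), -(r-3), \ldots, +(r-1)$.

```python
from collections import Counter
from itertools import permutations


def score_vs_natural(r):
    """Score of the arrangement r against the natural order 1, 2, ..., n."""
    n = len(r)
    return sum((r[j] > r[i]) - (r[j] < r[i])
               for i in range(n) for j in range(i + 1, n))


def basic(ranks):
    """Basic frequency distribution: scores of all distinct arrangements of ranks."""
    return Counter(score_vs_natural(p) for p in set(permutations(ranks)))


no_tie = basic((1, 2, 3, 4, 5))     # n = 5, no ties
one_pair = basic((1, 1, 3, 4, 5))   # n = 5, one tied pair
triplet = basic((1, 1, 1, 4, 5))    # n = 5, one triplet

scores = range(-10, 11)
# Relation (1): splitting a tied pair shifts the score by +1 or -1.
print(all(no_tie[s] == one_pair[s + 1] + one_pair[s - 1] for s in scores))
# Relation (2) with r = 3: splitting a triplet shifts the score by -2, 0 or +2.
print(all(one_pair[s] == triplet[s - 2] + triplet[s] + triplet[s + 2] for s in scores))
```

Both checks hold for $n = 5$; the same enumeration becomes slow for larger $n$, which is where the recurrence relations themselves are the more practical tool.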










FREQUENCY AND PROBABILITY DISTRIBUTIONS OF THE SCORE S
From a table of basic frequency distributions such as Table 1 a table
showing the probability of attaining or exceeding an observed value of $S$ by chance from an
uncorrelated pair of rankings can obviously be constructed, and Table 2 shows such
probabilities (positive $S$ only, negative values obtainable by symmetry) for values of $n$ up
to 10, and all possible numbers $p_2$ of paired and $p_3$ of triplet ties.


THE VARIANCE OF S
The variance of $S$ when ties are present can be readily derived by using the recurrence
relations given above and the value given by Kendall for the case of no ties. For the case
of tied pairs consider

\[
\begin{aligned}
(S+1)^2 \phi(S+1, p_1 - 2, p_2 + 1) &+ (S-1)^2 \phi(S-1, p_1 - 2, p_2 + 1) \\
&= S^2 \{\phi(S+1, p_1 - 2, p_2 + 1) + \phi(S-1, p_1 - 2, p_2 + 1)\} \\
&\quad + 2\{(S+1)\, \phi(S+1, p_1 - 2, p_2 + 1) - (S-1)\, \phi(S-1, p_1 - 2, p_2 + 1)\} \\
&\quad - \{\phi(S+1, p_1 - 2, p_2 + 1) + \phi(S-1, p_1 - 2, p_2 + 1)\}.
\end{aligned}
\]

If now both sides of this equation are summed over all values of $S$, the terms on the
left-hand side become

\[
\frac{n!}{(2!)^{p_2}}\, \operatorname{var} \phi(S, p_1 - 2, p_2 + 1).
\]

The first terms on the right-hand side become by virtue of the recurrence relation (1)

\[
\frac{n!}{(2!)^{p_2}}\, \operatorname{var} \phi(S, p_1, p_2);
\]

the second vanishes through the symmetry of the distribution, while the third becomes

\[
-\frac{2 \cdot n!}{(2!)^{p_2 + 1}}.
\]

Hence there is obtained

\[
\operatorname{var} \phi(S, p_1 - 2, p_2 + 1) = \operatorname{var} \phi(S, p_1, p_2) - 1
\]

and so

\[
\operatorname{var} \phi(S, n - 2p_2, p_2) = \operatorname{var} \phi(S, n) - p_2 = \frac{n(n-1)(2n+5)}{18} - p_2,
\]

using Kendall’s result.


These results can also be generalized, using equation (2), to deal with multiplet ties,
obtaining

\[
\operatorname{var} \phi(S, p_1 - 1, \ldots, p_{r-1} - 1, p_r + 1) = \operatorname{var} \phi(S, p_1, \ldots, p_{r-1}, p_r) - (r^2 - 1)/3,
\]

and

\[
\operatorname{var} \phi(S, p_1, p_2, \ldots, p_r) = \frac{n(n-1)(2n+5)}{18} - p_2 - \frac{3 + 8}{3}\, p_3 - \ldots - \frac{3 + 8 + \ldots + (r^2 - 1)}{3}\, p_r.
\]

It is obvious from these equations for multiplet ties that for any given number of ties of
each multiplicity the variance will tend towards that of the system without any ties as $n$
increases.
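
The variance formula for multiplet ties can be written as a short function. This is a sketch under the reading that each $r$-tuplet reduces the variance by $\{3 + 8 + \ldots + (r^2 - 1)\}/3$; the function name and the mapping `ties` (from $r$ to $p_r$) are hypothetical conventions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def var_score(n, ties):
    """Variance of S for n members; ties maps r -> p_r, the number of r-tuplet ties."""
    v = Fraction(n * (n - 1) * (2 * n + 5), 18)      # Kendall's value for no ties
    for r, p in ties.items():
        # each r-tuplet removes {3 + 8 + ... + (r^2 - 1)}/3 from the variance
        v -= Fraction(sum(k * k - 1 for k in range(2, r + 1)), 3) * p
    return v


print(var_score(4, {}))       # no ties
print(var_score(4, {2: 1}))   # one tied pair lowers the variance by unity
```

For $n = 4$ with one tied pair the result, 23/3, agrees with the value obtained directly from the basic distribution (frequencies 3, 2, 1 at scores $\pm 1$, $\pm 3$, $\pm 5$).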





APPLICATION TO A PRACTICAL CASE
The following results were obtained in a practical case in which two different tests were
carried out on each of a set of products. The problem is to determine the degree of
relationship between the results of the two tests. It is also an instance of an occurrence
which arises at times in practice, in which some of the results are ‘off the scale’ of
measurement with respect to one of the tests; these, twelve in all, have been given a tied ranking of 18.


Test A   40.80  41.70  36.75  37.55  29.40  25.20  26.75  28.45  26.85  26.35
Test B     1.5    1.5    1.5    2.5    3.5    >10    2.5    2.5    2.5    6.5

Test A   21.40  19.65  18.95  22.90  22.80  20.25  24.45  22.70  26.50
Test B     >10    >10    >10    >10    >10    >10    >10    >10    >10

Test A   22.00  27.50  23.75  30.80  21.00  27.10  22.10  19.25  25.45  24.10
Test B     >10    3.5    1.5    2.5      7    6.5    7.5      9    3.5    >10


Ranking the results according to their order in test A (from highest values to lowest) there
is obtained

 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
 1  1  5  1  5 10  5 10 13  5  5 18 13 10 18

16 17 18 19 20 21 22 23 24 25 26 27 28 29
18 18  1 18 18 18 16 18 18 15 18 18 17 18

The lower ranking has 1 pair, 1 triplet, 1 quadruplet, 1 quintuplet and one 12-member
multiplet. The maximum possible score is $\tfrac{1}{2}(29 \times 28) - 1 - 3 - 6 - 10 - 66 = 320$ in such a
ranking. The actual score is +212, giving $\tau = 212/320 = 0.6625$. The variance of the distribution
of the scores obtained with such rankings in the case of no correlation is

\[
\frac{29 \cdot 28 \cdot 63}{18} - 1 - \frac{11}{3} - \frac{26}{3} - \frac{50}{3} - \frac{638}{3} = 2599.33.
\]

Hence the probability of obtaining a score of 212 or more from an uncorrelated pair of
such rankings corresponds to the probability of a normal variate attaining or exceeding
$212/\sqrt{2599.33} = 4.158$ times its standard deviation.
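
The whole calculation for this example can be retraced from the tabulated test results. The sketch below makes two assumptions that should be noted: the results recorded as ‘>10’ are coded as infinity so that they tie with one another and rank last, and the sixth Test B entry is read as ‘>10’, which gives the twelve off-scale results mentioned in the text. Variable names are hypothetical.

```python
from collections import Counter
from math import sqrt

OFF = float("inf")  # assumed coding for results recorded as '>10' (off the scale)

test_a = [40.80, 41.70, 36.75, 37.55, 29.40, 25.20, 26.75, 28.45, 26.85, 26.35,
          21.40, 19.65, 18.95, 22.90, 22.80, 20.25, 24.45, 22.70, 26.50,
          22.00, 27.50, 23.75, 30.80, 21.00, 27.10, 22.10, 19.25, 25.45, 24.10]
test_b = [1.5, 1.5, 1.5, 2.5, 3.5, OFF, 2.5, 2.5, 2.5, 6.5,
          OFF, OFF, OFF, OFF, OFF, OFF, OFF, OFF, OFF,
          OFF, 3.5, 1.5, 2.5, 7, 6.5, 7.5, 9, 3.5, OFF]

# Arrange the test B results in the order of test A, highest value first.
order = sorted(range(len(test_a)), key=lambda i: -test_a[i])
b = [test_b[i] for i in order]
n = len(b)

# Score: +1, -1 or 0 for each pair according to the order of the test B results.
S = sum((b[j] > b[i]) - (b[j] < b[i]) for i in range(n) for j in range(i + 1, n))

tie_sizes = list(Counter(b).values())                  # sizes of the tied groups
max_score = n * (n - 1) // 2 - sum(r * (r - 1) // 2 for r in tie_sizes)
tau = S / max_score
variance = n * (n - 1) * (2 * n + 5) / 18 - sum(
    sum(k * k - 1 for k in range(2, r + 1)) / 3 for r in tie_sizes)
deviate = S / sqrt(variance)

print(S, max_score, tau, round(variance, 2), round(deviate, 3))
```

Run on the data as read above, this reproduces the score of 212, the maximum of 320, $\tau = 0.6625$, and a normal deviate of about 4.158.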


REFERENCES 


KENDALL, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81. 

KENDALL, M. G. (1942). Partial rank correlation. Biometrika, 32, 277. 

KENDALL, M. G. (1943). The Advanced Theory of Statistics, 1, chapter 16, Rank Correlation. London: 
C. Griffin and Co. Ltd. 

KENDALL, M. G. (1945). The treatment of ties in ranking problems. Biometrika, 33, 239. 




















Table 2. Probability that S attains or exceeds a specified value, in uncorrelated rankings. (Shown only for positive values; negative values obtainable by symmetry.)

[Table body not reproduced.]

Notes. (a) Probabilities of S when no ties are present have already been given by Kendall (1943) but are included here for completeness and in order to facilitate comparisons.
(b) Where many noughts occur between the decimal point and the first significant figure, an abbreviated notation has been used, e.g. 0.0₅88 stands for 0.0000088.





THE USE OF RANGE IN PLACE OF STANDARD
DEVIATION IN THE t-TEST


By E. LORD, B.Sc., Shirley Institute, Didsbury, Manchester 


CONTENTS

(i) Introduction
(ii) The t-test
(iii) The modified test (u-test) based on range
(iv) Computation of percentage points of the distribution of u = u(1, n), i.e. case with m = 1
(v) Computation of percentage points of the distribution of u = u(2, n), i.e. case with m = 2
(vi) Computation of percentage points of the distribution of u = u(m, n) for m > 2
(vii) Approximate values of the percentage points of u
(viii) Applications of the u-test
Appendix. On the independence of mean and some estimates of standard deviation in random samples from a normal population


(i) INTRODUCTION


The difference between highest and lowest values has always been recognized as a general
indication of the variability of quantitative data. It was not, however, until 1925 that
attention began to be focused upon the range as a useful statistical tool. In his paper 'On
the extreme individuals and the range in samples taken from a normal population', Tippett
(1925) obtained an expression for the mean value of the range in repeated random samples,
and computed its value in terms of the population standard deviation for samples of size
n = 2 to n = 1000. He also gave numerical approximations to the values of the moments
of the range for fairly large samples. 

The work was taken up by E. S. Pearson (1926), who determined numerically the exact
values of the moments of the random sampling distribution of the range for small samples
of size n < 6, and also approximations to their values for samples of medium size. In a
subsequent paper E. S. Pearson (1932) tabulated the upper and lower percentage limits for
the distribution of the range from frequency curves fitted with the values of the moment 
coefficients taken from both of the earlier papers cited above. 

The next advance was the determination of a general expression for the distribution of 
the range in samples of n random values from any population by McKay & Pearson (1933). 
For the normal population, only in the case of n = 3 was it found possible to obtain a fairly
simple analytical form. (The distribution for n = 2 is, of course, well known, taking the 
form of the positive half of a normal curve.) 

Hartley (1942) later determined an expression for the probability integral of the range and, 
with Pearson (1942), tabulated this for the normal population for samples between n = 2 and 
n = 20. This latter paper also contains a table of several percentage limits of the range 
in samples from a normal population. These limits are derived from the numerical values of 
the probability integrals and replace the approximate values previously given by Pearson 
referred to above. 

Tippett (1925) and Pearson (1932) have pointed out that although the total range in a 
sample may be used for the purpose of estimating the population value of the standard
deviation, a more efficient measure may be obtained by dividing the sample into random
subgroups of equal size and using the mean range of the subgroups in place of the total range. 
The efficiency of range estimates of standard deviation is, of course, always less than that of
root-mean-square estimates, but the work of Davies & Pearson (1934) and Pearson & Haines 
(1935) indicates that information is not discarded to any serious extent providing that the 
number of observations in the subsamples is not greatly in excess of about 10. 

As a result of the work outlined above, the range is now of considerable importance in 
many fields, especially in industrial quality control, where its simplicity has enabled it to 
be extensively and easily applied to the measurement of fluctuations in the variability of 
quality of a manufactured article or material. 

In the present paper an investigation is made of the use of range estimates of standard
deviation in the consideration of the statistical significance of deviations of sample means in
normal random sampling theory. This use of range estimates of standard deviation is 
analogous to the use of root-mean-square estimates in the well-known t-test. Tables are 
given, at several probability levels, and these may be employed in determining the statistical 
significance of either the deviation of a sample mean from some fixed or hypothetical popula- 
tion value, or the difference between the means of two samples. These tables may also be 
used for obtaining rapid estimates of the accuracy of a sample mean from the variation 
within the sample as measured by the range. The use of range, in place of root-mean-square 
estimates of standard deviation, in this modified form of the t-test necessarily entails some 
loss of precision. It will, however, be shown in a future paper that this reduction in accuracy 
is negligible for all practical purposes. Furthermore, this slight disadvantage of the new test 
is compensated by its greater simplicity, involving a reduced amount of computing compared 
with the usual t-test. 

The range test is suitable for application to many problems frequently encountered in the 
treatment of various types of experimental data and in considering the mean character value 
in small samples in biological experiments. In the industrial field, the range test may be 
used for detecting changes in mean quality level, especially where the variation is not 
under strict statistical control or is subject to secular changes, or for determining whether 
the average level of a batch determined from a sample is in accordance with specification 
demands. A number of these problems are covered in the examples given at the end of the
paper.


(ii) THE t-TEST*


In testing the significance of the deviation of a sample mean $\bar{x}$ from an assumed population
value $\xi$, use is made of the ratio

$$t = \frac{|\bar{x} - \xi|\,\sqrt{N}}{s}, \tag{1}$$

where N is the size of the sample and s is the root-mean-square estimate of the population
standard deviation determined from the sample. In applying this ratio it is assumed that
the N values form a random sample from a normal population of which the mean is $\xi$,
standard deviation $\sigma$, and the distribution of values of x is given by

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\xi)^2}{2\sigma^2}}. \tag{2}$$


* ‘Student’ (1908), R. A. Fisher (1925). 








More generally t may be defined as the ratio 
$$t = x/s, \tag{3}$$


where x and s are statistically independent, x being a quantity distributed normally about 
a mean of zero and s a root-mean-square estimate based on v degrees of freedom of the 
standard error of x. Although the use of the tables of the probability integral of t enables 
the most efficient tests to be made of the various forms of the so-called ‘Student’s Hypo- 
thesis’, occasions frequently arise when more rapid tests are desirable, especially if accom- 
panied by only inappreciable loss of accuracy. The calculation of s, depending upon the 
squaring of numerical quantities, entails a certain amount of labour, especially if tables of 
squares or a calculating machine are not available. The use of the range, or the mean range 
determined from random subgroups in a sample, enables very rapid estimates to be made 
of the population value of the standard deviation o. In the following section these range 
estimates are used in place of root-mean-square estimates in a modified form of the t-test. 


(iii) THE MODIFIED TEST (u-TEST) BASED ON RANGE


Here we replace the s of 'Student's' ratio by an estimate of $\sigma$ based on the range. Thus

$$u = u(m, n) = \frac{x}{\bar{w}(m, n)/d_n}, \tag{4}$$

where x is a quantity distributed normally about a mean of zero and $\bar{w}(m, n)$ is the mean
value of m ranges w, obtained from m independent samples or subgroups, each containing
n observations. The constant $d_n$, in a commonly used notation,* is the expected value of the
range in samples of n, randomly selected from a normal population of unit standard deviation.
The ratio $\bar{w}(m, n)/d_n$ is therefore an estimate of the standard error of x obtained from
range and, as such, replaces the root-mean-square estimate s used in the ratio t = x/s.

Except for a few special cases, it has not been found possible to determine the analytical 
form of the distribution of u, but several tables of percentage points have been computed 
for use in testing the various statistical hypotheses normally covered by the t-test. The 
computation of these tables is considerably simplified by first determining the percentage 
points of the distribution of the subsidiary quantity 


$$q = q(m, n) = \frac{u(m, n)}{d_n} = \frac{x}{\bar{w}(m, n)}, \tag{5}$$

and then multiplying by the corresponding value of $d_n$ to obtain the percentage points of the
u distribution.
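The definitions (4) and (5) can be illustrated numerically. The following is a minimal sketch and not part of the original paper: the function names are hypothetical, $d_n$ is approximated from the standard identity $d_n = \int_{-\infty}^{\infty}\{1 - \Phi(x)^n - [1-\Phi(x)]^n\}\,dx$ by the trapezoidal rule rather than taken from Tippett's table, and the one-sample form $x = (\bar{x} - \xi)\sqrt{N}$ is assumed by analogy with (1). Consecutive observations are grouped, which presumes that the observations are already in random order.

```python
import math


def normal_cdf(x):
    """Standard normal probability integral."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range_constant(n, step=0.001, limit=8.0):
    """Approximate d_n, the expected range of n values from N(0, 1).

    Assumption: trapezoidal rule on 1 - Phi^n - (1 - Phi)^n over [-limit, limit].
    """
    count = int(round(2.0 * limit / step))
    total = 0.0
    for i in range(count + 1):
        x = -limit + i * step
        phi = normal_cdf(x)
        f = 1.0 - phi ** n - (1.0 - phi) ** n
        total += f if 0 < i < count else 0.5 * f
    return total * step


def u_one_sample(values, n, hypothesised_mean):
    """Signed u of equation (4) for a one-sample test (illustrative only).

    The N = m * n observations are split into m consecutive subgroups of n.
    """
    total_size = len(values)
    if total_size % n != 0:
        raise ValueError("sample size must be a multiple of the subgroup size")
    groups = [values[i:i + n] for i in range(0, total_size, n)]
    mean_range = sum(max(g) - min(g) for g in groups) / len(groups)
    sample_mean = sum(values) / total_size
    x = (sample_mean - hypothesised_mean) * math.sqrt(total_size)
    return x * mean_range_constant(n) / mean_range
```

The value returned would then be compared with the tabulated percentage point of u for the same m and n.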

To simplify the algebraic expressions in what follows, u, $\bar{w}$ and q will be written for u(m, n),
$\bar{w}(m, n)$ and q(m, n) where no confusion is involved.

The distributions of both u and q are clearly independent of $\sigma$. Hence, without any loss
of generality, $\sigma$ may be taken equal to unity in considering the distributions. The distribution
of x will therefore be defined by

$$p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}. \tag{6}$$


* See, for example, Pearson (1935), pp. 84 and 90. 




Furthermore, let the distribution of the range w in a sample of n be $y = p(w)$, that of $\bar{w}$
be $y = p(\bar{w})$, and that of q be $y = p(q)$. Then since x and $\bar{w}$ are defined to be statistically
independent, we have the distribution of q given by

$$p(q) = \iint p(\bar{w})\, p(x)\, d\bar{w}\, dx, \tag{7}$$

where the integral is to be evaluated over the field of all values of x and $\bar{w}$ subject to the
relation (5) and to the conditions:

$$-\infty < x < \infty, \qquad 0 \le \bar{w} < \infty. \tag{8}$$

Since x is distributed symmetrically about zero, and $\bar{w}$ is independent of x, the ratio q is
also symmetrically distributed about zero. Let $q_\alpha$ be the value of q such that $\alpha$ is the chance
that $|q| > q_\alpha$. The quantity $\alpha$ represents the total area of the two equal tails of the
distribution lying outside deviations $\pm q_\alpha$, and we have

$$(1 - \alpha) = 2\int_0^{q_\alpha} p(q)\, dq. \tag{9}$$

Alternatively, from (6), (7) and (8), this may be written in the form

$$1 - \alpha = \int_0^\infty p(\bar{w}) \left\{ \int_{-q_\alpha \bar{w}}^{q_\alpha \bar{w}} p(x)\, dx \right\} d\bar{w}
= 2\int_0^\infty p(\bar{w}) \left\{ \int_0^{q_\alpha \bar{w}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}\, dx \right\} d\bar{w}. \tag{10}$$

Except for a few cases in which the analytical form of the distribution of u has been
obtained, equation (10) has been used to compute values of $q_\alpha$ and, hence, the percentage
points of u for values of $\alpha$ = 0.10, 0.05, 0.02, 0.01, 0.002 and 0.001, with values of n from
2 to 20 and values of m from 1 to 10, 15, 20, 30, 60 and 120.

The percentage points of u = u(m, n) are first considered for the case when the estimate of
standard deviation is based on the value of a single range of n random values (i.e. for m = 1). 
This treatment is followed by the case of m = 2, and finally consideration is given to the 
general case using estimates of standard deviation determined from the mean of m ranges 
each from an independent subgroup of n random values. 





(iv) COMPUTATION OF PERCENTAGE POINTS OF THE DISTRIBUTION OF 
u = u(1, n), i.e. CASE WITH m = 1 
Throughout this section the estimates of standard deviation are all based upon the value of 
the range in a single set of n random values of the variate (thus $\bar{w} = w$). In the case of n = 2
and n = 3 analytical solutions are derived for the distributions from which the percentage
points of q and u are calculated. For n ≥ 4, percentage points in the neighbourhood of those
desired are determined by quadrature methods, and the required points obtained from these 
by interpolation. 
Special case n = 2, m = 1 


The distribution of ranges (w) in samples of two random values from a normal population
with unit standard deviation is the distribution of absolute differences between random pairs
of variate values, and this may be easily shown to be

$$p(w)\, dw = \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}w^2}\, dw \qquad (0 \le w < \infty). \tag{11}$$









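Since x and w are independent, it follows from (6) and (11) that their joint probability
distribution is

$$p(w, x)\, dw\, dx = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}(x^2 + \frac{1}{2}w^2)}\, dw\, dx. \tag{12}$$

Transforming to new variables, q = x/w and w, and noting that the Jacobian of the transformation

$$\frac{\partial(x, w)}{\partial(q, w)} = w,$$

the joint distribution of q and w is given by

$$p(q, w)\, dq\, dw = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dq\, dw. \tag{13}$$

To obtain the distribution of q it is necessary to integrate (13) over the whole field of w,
from 0 to ∞. This gives

$$p(q)\, dq = \frac{dq}{\pi\sqrt{2}\,(q^2 + \frac{1}{2})}. \tag{14}$$

Hence from (9) and (14) above, the percentage points of q are given by

$$(1 - \alpha) = \frac{2}{\pi\sqrt{2}} \int_0^{q_\alpha} \frac{dq}{q^2 + \frac{1}{2}} = \frac{2}{\pi} \tan^{-1}\!\left(q_\alpha\sqrt{2}\right),$$

and hence

$$q_\alpha(1, 2) = \frac{1}{\sqrt{2}} \tan\!\left[\frac{\pi}{2}(1 - \alpha)\right] = \frac{1}{\sqrt{2}} \cot\!\left(\frac{\pi\alpha}{2}\right). \tag{15}$$

The values of $q_\alpha$ determined from (15), for the six values of $\alpha$ under consideration, are
multiplied by $d_2 = 2/\sqrt{\pi}$ to give the required percentage points of the distribution of
u = u(1, 2).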
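Equation (15) can be evaluated directly. The short sketch below (function names are illustrative, not from the paper) computes $q_\alpha(1, 2)$ and the corresponding percentage points of u(1, 2) for the six probability levels, so that they may be compared with the n = 2 row of Table 1.

```python
import math

ALPHAS = (0.10, 0.05, 0.02, 0.01, 0.002, 0.001)


def q_alpha_1_2(alpha):
    """Percentage point of q(1, 2) from equation (15)."""
    return math.tan(0.5 * math.pi * (1.0 - alpha)) / math.sqrt(2.0)


def u_alpha_1_2(alpha):
    """Percentage point of u(1, 2): q_alpha multiplied by d_2 = 2 / sqrt(pi)."""
    d2 = 2.0 / math.sqrt(math.pi)
    return d2 * q_alpha_1_2(alpha)


if __name__ == "__main__":
    for alpha in ALPHAS:
        print(alpha, round(u_alpha_1_2(alpha), 4))
```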


Special case n = 3, m = 1 


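For random samples of size n = 3 from a normal distribution with unit standard deviation,
the distribution of the range has been found by McKay & Pearson (1933) and takes the form

$$p(w) = \frac{3\sqrt{2}}{\pi}\, e^{-\frac{1}{4}w^2} \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz.$$

Again, since w and x are independent, their joint distribution is given by

$$p(w, x)\, dw\, dx = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}(x^2 + \frac{1}{2}w^2)}\, dw\, dx \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz. \tag{16}$$

Transforming to new variables q = x/w and w, it follows from (16), since the Jacobian of
the transformation is equal to w, that the joint distribution of q and w is given by

$$p(q, w)\, dq\, dw = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dw\, dq \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz. \tag{17}$$

To obtain the distribution of q the expression in (17) has to be integrated over the whole
field of w, from 0 to ∞. Thus

$$p(q) = \frac{3}{\pi\sqrt{\pi}} \int_0^\infty e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dw \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz.$$

Putting $t = z\sqrt{6}$, the above may be written

$$\begin{aligned}
p(q) &= \frac{3}{\pi\sqrt{6\pi}} \int_0^\infty e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dw \int_0^{w} e^{-t^2/12}\, dt \\
&= \frac{3}{\pi\sqrt{6\pi}\,(q^2 + \frac{1}{2})} \left[ -e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})} \int_0^{w} e^{-t^2/12}\, dt \right]_{w=0}^{w=\infty}
+ \frac{3}{\pi\sqrt{6\pi}\,(q^2 + \frac{1}{2})} \int_0^\infty e^{-w^2/12}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, dw.
\end{aligned} \tag{18}$$

The first expression in (18) is clearly zero and hence

$$p(q) = \frac{3}{\pi\sqrt{6\pi}\,(q^2 + \frac{1}{2})} \int_0^\infty e^{-\frac{1}{2}w^2(q^2 + \frac{2}{3})}\, dw
= \frac{\sqrt{3}}{2\pi\,(q^2 + \frac{1}{2})\sqrt{q^2 + \frac{2}{3}}},$$

and therefore

$$1 - \alpha = \frac{6}{\pi} \tan^{-1}\!\left\{ \frac{q_\alpha}{\sqrt{3q_\alpha^2 + 2}} \right\}. \tag{19}$$

If $\tan\!\left[\frac{\pi}{6}(1 - \alpha)\right]$ be denoted by $\tau$, then

$$q_\alpha(1, 3) = \frac{\tau\sqrt{2}}{\sqrt{1 - 3\tau^2}}. \tag{20}$$

The six required values of $q_\alpha$ are found by substitution of the corresponding values of $\alpha$ in
(20) above, and further multiplication by $d_3 = 3/\sqrt{\pi}$ gives the percentage limits of the
distribution of u = u(1, 3).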
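The closed form for n = 3 can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's own computation: it takes the density of q in the simplified form $\sqrt{3}/\{2\pi(q^2 + \frac{1}{2})\sqrt{q^2 + \frac{2}{3}}\}$ that follows from (18), integrates it by the trapezoidal rule up to the $q_\alpha$ given by (20), and confirms that the central area is $1 - \alpha$ as (9) requires. Function names are hypothetical.

```python
import math


def q_alpha_1_3(alpha):
    """Percentage point of q(1, 3) from equation (20)."""
    tau = math.tan(math.pi * (1.0 - alpha) / 6.0)
    return tau * math.sqrt(2.0) / math.sqrt(1.0 - 3.0 * tau * tau)


def density_q_1_3(q):
    """Density of q for n = 3, m = 1 (simplified form following (18))."""
    return math.sqrt(3.0) / (
        2.0 * math.pi * (q * q + 0.5) * math.sqrt(q * q + 2.0 / 3.0)
    )


def central_area(q_limit, steps=20000):
    """Twice the integral of the density from 0 to q_limit (trapezoidal rule)."""
    h = q_limit / steps
    total = 0.0
    for i in range(steps + 1):
        f = density_q_1_3(i * h)
        total += f if 0 < i < steps else 0.5 * f
    return 2.0 * total * h


def u_alpha_1_3(alpha):
    """Percentage point of u(1, 3): q_alpha multiplied by d_3 = 3 / sqrt(pi)."""
    return 3.0 / math.sqrt(math.pi) * q_alpha_1_3(alpha)
```

Agreement between `central_area(q_alpha_1_3(alpha))` and `1 - alpha` indicates that (19) and (20) are consistent with the density.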


General case n ≥ 4, m = 1

For n ≥ 4 no suitable algebraic expression exists for the distribution of the range, but
Pearson & Hartley (1942) have tabulated values of the probability integral $\int_0^w p(w)\, dw$ to
4 figures at intervals of 0.05 of w for values of n from 2 to 20. Hartley kindly lent manuscript
tables of the integral tabulated to 5 figures at intervals of 0.25 of w for values of n from
2 to 16. Using these five-figure tables for n = 4, 6, 10, 16 and the four-figure tables for n = 20,
the frequency distribution of w was obtained numerically by subtraction of successive values
of $\int_0^w p(w)\, dw$ at intervals of 0.25 and then converting these class frequencies into ordinates
y(w). The degree of approximation in the formula used implied the vanishing of fifth
differences (see K. Pearson, Tables for Statisticians and Biometricians, Part II, p. xvii).










Each case was treated in turn, and the six values of the percentage points $q_\alpha$, corresponding
to the six different values of $\alpha$ under consideration, were determined using the relations
given in (10) above. Taking a trial value of $q_\alpha$, the integrals

$$I(w, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{q_\alpha w} e^{-\frac{1}{2}x^2}\, dx \tag{21}$$

were calculated at intervals of 0.25 over the whole range of w. Quadrature was then applied
to the products $y(w)\, I(w, q_\alpha)$ over the range $0 \le w < \infty$, to obtain the value of $(1 - \alpha)$
corresponding to the trial value of $q_\alpha$. This procedure was repeated a number of times to obtain
values of $(1 - \alpha)$ corresponding to a series of equidistant values of $q_\alpha$. The required values of
$q_\alpha$ corresponding to the six values of $\alpha$ under investigation were then obtained by backward
interpolation.
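The quadrature procedure can be sketched in a few lines. The example below is illustrative only and departs from the paper in two labelled respects: it uses the known density (11) for n = 2 in place of ordinates derived from Hartley's tables, with a much finer interval than 0.25, and it replaces backward interpolation over equidistant trial values by simple bisection. The integral (21) is evaluated as $\operatorname{erf}(q_\alpha w/\sqrt{2})$. Function names are hypothetical.

```python
import math


def range_density_n2(w):
    """Equation (11): density of the range of two values from N(0, 1)."""
    return math.exp(-0.25 * w * w) / math.sqrt(math.pi)


def central_probability(q, density, step=0.002, upper=12.0):
    """1 - alpha for a trial q: trapezoidal rule on y(w) * I(w, q).

    I(w, q) of equation (21) equals erf(q * w / sqrt(2)).
    """
    count = int(round(upper / step))
    total = 0.0
    for i in range(count + 1):
        w = i * step
        f = density(w) * math.erf(q * w / math.sqrt(2.0))
        total += f if 0 < i < count else 0.5 * f
    return total * step


def solve_q(alpha, density, low=0.0, high=100.0, iterations=40):
    """Find q_alpha by bisection (an assumed substitute for backward interpolation)."""
    for _ in range(iterations):
        mid = 0.5 * (low + high)
        if central_probability(mid, density) < 1.0 - alpha:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)
```

For n = 2 the result may be compared with the exact value given by (15), which provides a check on the numerical method before it is applied to cases with no closed form.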

As n → ∞ the ratio $w/d_n$ tends to the population value of the standard deviation. Furthermore,
for n = 2 and n = 3, exact values of $q_\alpha$ had been previously obtained by direct calculation
for the six values of $\alpha$. Thus it was possible to make initial estimates of the required
values of $q_\alpha$, and the process of this 'trial and error' method was not found too laborious.


Table 1. Framework values of percentage points of u = u(1, n)

n \ α     0.10      0.05       0.02     0.01     0.002    0.001
2         5.0376    10.1381−   25.389   50.791   253.97   507.95−
3         2.5935−   3.8225+    6.188    8.819    19.84    28.08
4         2.1793    2.9505+    4.213    5.420    9.42     11.75+
6         1.9354    2.4755+    3.249    3.900    5.71     6.66
10        1.8064    2.2390     2.807    3.244    4.32     4.82
16        1.7496    2.1385+    2.628    2.990    3.82     4.19
20 (a)    1.7320    2.1083     2.576    2.916    3.69     4.01
20 (b)    1.7314    2.1074     2.576    2.916    3.69     4.02
∞         1.6449    1.9600     2.326    2.576    3.09     3.29


























The framework values of the percentage points of u(1, n) were obtained by multiplying
the values of $q_\alpha$ by the corresponding values of the mean range $d_n$ tabulated by Tippett
(1925) and are given in Table 1, together with the exact values for n = 2 and n = 3
determined above.

As a check on the accuracy of determination of these percentage points, the six values
for n = 20 were also calculated by a second method. Writing

$$r = 1/q = w/x, \tag{22}$$

the method is to determine, for the six values of $\alpha$, the corresponding values of $q_\alpha$ such that

$$\alpha = \int_{-1/q_\alpha}^{0} p(r)\, dr + \int_{0}^{1/q_\alpha} p(r)\, dr, \tag{23}$$

where y = p(r) denotes the frequency distribution of r. Since q is distributed symmetrically
about zero, its reciprocal r is also distributed symmetrically about zero, and from (22)
and (23) it follows that

$$\alpha = 2\int_0^{1/q_\alpha} p(r)\, dr = 2\int_0^\infty y(x)\, dx \int_0^{x/q_\alpha} p(w)\, dw. \tag{24}$$

Ordinates of the normal curve $y(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$ were taken at intervals of 0.25 for x in
the range $0 \le x < \infty$ from K. Pearson's Tables for Statisticians and Biometricians, Part I.
Taking a trial value of $1/q_\alpha$, the integrals $\int_0^{x/q_\alpha} p(w)\, dw$ were calculated from Hartley's
four-figure tables for each value of x in the above range. By quadrature applied to the products
$y(x)\int_0^{x/q_\alpha} p(w)\, dw$ the integral (24) was evaluated. By taking a series of equidistant values of
$1/q_\alpha$ other trial values of $\alpha$ were determined. Backward interpolation was then used to
obtain the required values of $q_\alpha$ corresponding to the six values of $\alpha$ under consideration.
Finally, the six percentage points of u were determined by multiplying the values of $q_\alpha$ by
$d_{20}$ given in Tippett's table. These percentage points are given in the penultimate line (b)
of Table 1, and comparison with the corresponding values in the line above, (a), indicates
good agreement between the two methods of computation.

Since the percentage points of u for n = 4, 6, 10 and 16 have been determined using
Hartley's five-figure manuscript tables of the cumulative frequency distribution of w, they
should be at least as accurate as the percentage points for n = 20 determined from the
four-figure tables.

Changes in the percentage points at the different levels of significance run most smoothly
if arguments proportional to 1/n are used in place of n, and reciprocals of u for the variate.
Using a six-point general Lagrangian formula applied to the points corresponding to
n = 3, 4, 6, 10, 16 and 20, values of percentage points of u were determined for n = 5, 7, 8, 14
and 18. (In the case of n = 20 the mean values of the percentage points determined by the
two methods were used.) The interval was then halved, using a nine-point Lagrangian
through points corresponding to n = 4, 6, 8, 10, 12, 14, 16, 18 and 20. Finally, the six sets
of percentage points were differenced as a check, reduced by either one or two figures and, with
the exception of those for n = 5, are given in the second columns of Tables 3-8 under m = 1.
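The six-point interpolation can be sketched as follows. This is an illustration, not the author's worksheet: the Lagrangian formula is applied with argument 1/n and variate 1/u, using the α = 0.10 framework values of Table 1 (the two n = 20 entries being averaged, as stated above). The function names are hypothetical.

```python
def lagrange(xs, ys, x):
    """Value at x of the Lagrangian polynomial through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total


def interpolate_u(n_nodes, u_nodes, n_target):
    """Interpolate a percentage point of u using argument 1/n and variate 1/u."""
    xs = [1.0 / n for n in n_nodes]
    ys = [1.0 / u for u in u_nodes]
    return 1.0 / lagrange(xs, ys, 1.0 / n_target)


# Framework values of u(1, n) at alpha = 0.10, taken from Table 1.
FRAMEWORK_N = (3, 4, 6, 10, 16, 20)
U_ALPHA_010 = (2.5935, 2.1793, 1.9354, 1.8064, 1.7496, 0.5 * (1.7320 + 1.7314))

u_1_5 = interpolate_u(FRAMEWORK_N, U_ALPHA_010, 5)
```

The interpolated value for n = 5 can be set against the figure of 2.0198 quoted in the text for interpolation at α = 0.10.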

For n = 5, the six percentage points of u were independently determined at a later stage
of the investigation by the method used above for n = 4, 6, 10, 16 and 20, and it is these
values which are given in Tables 3-8. In the table below, the values obtained by interpolation
from the framework values are compared with those determined by direct calculation.


Percentage points of u(1, 5)

α =                      0.10     0.05      0.02    0.01    0.002   0.001
By direct calculation    2.019    2.635+    3.56    4.38    6.8     8.2
By interpolation         2.020    2.635+    3.56    4.38    6.8     8.2





























In one case only, for α = 0.10, is there a difference as much as one unit in the last figure,
the actual values obtained being 2.0192 by direct computation and 2.0198 by interpolation.

Taking all the various checks into consideration, it appears unlikely that the values of
the percentage points of u(1, n) given in the tables at the end of the paper are in error by
more than one unit in the last place. The values for n = 2 and n = 3 are, of course, exact.










(v) COMPUTATION OF PERCENTAGE POINTS OF THE DISTRIBUTION OF
u = u(2, n), i.e. CASE WITH m = 2

The exact distribution of u(2, n) cannot be determined analytically except in the case of
n = 2, and hence the various percentage points have necessarily to be evaluated almost
wholly by numerical methods.

The probability of the joint occurrence of a pair of ranges w′ and w″ from random samples
of equal size n from a normal population of unit standard deviation is given by

$$p(w', w'')\, dw'\, dw'' = p(w')\, p(w'')\, dw'\, dw''. \tag{25}$$

If $\bar{w}$ be the mean value of the two ranges, then its distribution is obtained by integrating (25):

$$p(\bar{w})\, d\bar{w} = \iint p(w')\, p(w'')\, dw'\, dw'', \tag{26}$$

the integration being taken over the whole field of w′ and w″ subject to the conditions

$$\bar{w} = \tfrac{1}{2}(w' + w'') \qquad (0 \le w' < \infty,\ 0 \le w'' < \infty). \tag{27}$$

We shall change the variables from w′ and w″ to $\bar{w}$ and w′, the Jacobian of the transformation
being equal to 2. With these new variables, and noting from (27) that w′ varies from 0 to
$2\bar{w}$, equation (26) gives

$$p(\bar{w}) = 2\int_0^{2\bar{w}} p(w')\, p(2\bar{w} - w')\, dw'. \tag{28}$$
0 
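Special case n = 2, m = 2


In equation (11) above is given the distribution of the range in samples of two random
values from a normal population with unit standard deviation. Substituting this in (28)
above, the distribution of means of two independent values of w is therefore given by

$$\begin{aligned}
p(\bar{w}) &= 2\int_0^{2\bar{w}} \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}{w'}^2}\, \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}(2\bar{w} - w')^2}\, dw' \\
&= \frac{2}{\pi}\, e^{-\frac{1}{2}\bar{w}^2} \int_0^{2\bar{w}} e^{-\frac{1}{2}(w' - \bar{w})^2}\, dw' \\
&= \frac{8}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2} \int_0^{\bar{w}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz.
\end{aligned} \tag{29}$$

Using the notation

$$I(\bar{w}) = \int_0^{\bar{w}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz,$$

the expression (29) for the distribution of the mean range may be written in the form

$$p(\bar{w}) = \frac{8}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2}\, I(\bar{w}). \tag{30}$$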
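The passage from (28) to (30) can be verified numerically. The sketch below (illustrative names; trapezoidal rule assumed in place of the unspecified quadrature formula) evaluates the convolution (28) with the n = 2 range density (11) and compares it with the closed form (30), in which $I(\bar{w}) = \frac{1}{2}\operatorname{erf}(\bar{w}/\sqrt{2})$.

```python
import math


def range_density_n2(w):
    """Equation (11): density of the range of two values from N(0, 1)."""
    return math.exp(-0.25 * w * w) / math.sqrt(math.pi)


def mean_of_two_density(w_bar, density, steps=2000):
    """Equation (28): density of the mean of two independent ranges."""
    if w_bar <= 0.0:
        return 0.0
    h = 2.0 * w_bar / steps
    total = 0.0
    for i in range(steps + 1):
        w1 = i * h
        f = density(w1) * density(2.0 * w_bar - w1)
        total += f if 0 < i < steps else 0.5 * f
    return 2.0 * total * h


def closed_form_n2(w_bar):
    """Equation (30) with I(w_bar) written as erf(w_bar / sqrt(2)) / 2."""
    integral = 0.5 * math.erf(w_bar / math.sqrt(2.0))
    return 8.0 / math.sqrt(2.0 * math.pi) * math.exp(-0.5 * w_bar * w_bar) * integral
```

The same `mean_of_two_density` routine applies unchanged to any tabulated or computed range density, which is the situation described for n ≥ 3.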


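We may now proceed to determine the distribution of the ratio $q = x/\bar{w}$. Since they are
independent, the joint distribution of x and $\bar{w}$ is, from (6) and (30), given by

$$p(x, \bar{w})\, dx\, d\bar{w} = \frac{4}{\pi}\, e^{-\frac{1}{2}(x^2 + \bar{w}^2)}\, I(\bar{w})\, dx\, d\bar{w}. \tag{31}$$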




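Transforming to variables q and $\bar{w}$, noting that the Jacobian of the transformation is equal
to $\bar{w}$, and integrating, we obtain

$$p(q) = \frac{4}{\pi} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w})\, \bar{w}\, d\bar{w}. \tag{32}$$

Now since

$$\frac{dI(\bar{w})}{d\bar{w}} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2},$$

we make use of the identity

$$\frac{d}{d\bar{w}} \left\{ e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w}) \right\}
= -\bar{w}(q^2 + 1)\, e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w}) + \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2}\, e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}$$

to evaluate the integral in (32) and obtain

$$p(q) = \frac{4}{\pi\sqrt{2\pi}\,(q^2 + 1)} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2}\, e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, d\bar{w}
= \frac{2}{\pi\,(q^2 + 1)\sqrt{q^2 + 2}}. \tag{33}$$

The ratio q is distributed symmetrically about zero and from (9) and (33) we obtain

$$1 - \alpha = \frac{4}{\pi} \int_0^{q_\alpha} \frac{dq}{(q^2 + 1)\sqrt{q^2 + 2}},$$

and the percentage points are therefore given by

$$q_\alpha = q_\alpha(2, 2) = \frac{\tau\sqrt{2}}{\sqrt{1 - \tau^2}}, \tag{34}$$

where $\tau = \tan\!\left[\frac{\pi}{4}(1 - \alpha)\right]$.

Substitution of α = 0.10, 0.05, etc., in (34) gives the required values of $q_\alpha$, and further
multiplication by $d_2 = 2/\sqrt{\pi}$ gives the required corresponding percentage points of the
distribution of u(2, 2).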
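As with the case m = 1, the closed form can be evaluated directly. The following sketch (function names are illustrative) computes $q_\alpha(2, 2)$ from (34) and the corresponding percentage points of u(2, 2) for comparison with the n = 2 row of Table 2.

```python
import math

ALPHAS = (0.10, 0.05, 0.02, 0.01, 0.002, 0.001)


def q_alpha_2_2(alpha):
    """Percentage point of q(2, 2) from equation (34)."""
    tau = math.tan(0.25 * math.pi * (1.0 - alpha))
    return tau * math.sqrt(2.0) / math.sqrt(1.0 - tau * tau)


def u_alpha_2_2(alpha):
    """Percentage point of u(2, 2): q_alpha multiplied by d_2 = 2 / sqrt(pi)."""
    return 2.0 / math.sqrt(math.pi) * q_alpha_2_2(alpha)


if __name__ == "__main__":
    for alpha in ALPHAS:
        print(alpha, round(u_alpha_2_2(alpha), 4))
```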


General case n ≥ 3, m = 2


For n = 4, 6, 10, 16 and 20, the ordinates y(w) of the distribution of the range have been
previously evaluated above at intervals of 0.25 for w, and these are used in place of the
unknown p(w). Taking a particular value of $\bar{w}$, quadrature was then applied to the products
$y(w')\, y(2\bar{w} - w')$ to obtain a numerical estimate of $p(\bar{w})$ from equation (28). This process was
repeated at intervals of 0.25 for $\bar{w}$ through as much of the range $0 \le \bar{w} < \infty$ as was necessary
to obtain the required degree of accuracy. For n = 3, determination of the ordinates of the
distribution of $\bar{w}$ was also based on quadrature, but, in this case, exact figures for the


ordinates of the distribution of the range were obtained from its equation found by McKay 
& Pearson (1933). Because of the rapid rise of the distribution near to the origin, estimates 
of the lower values of p(w) in this case were determined using an interval of 4, for w in order 
to obtain the requisite accuracy. For higher values of $\bar{w}$ the interval was progressively
decreased to 0.25 used over the tail portion of the curve.










Treating in turn each curve of the distribution of $\bar{w}$ (for n = 3, 4, 6, 10, 16 and 20), the
values of the percentage points of the distribution of the ratio

$$q = x/\bar{w}$$

were computed by a method similar to that used in evaluating the percentage points of q
for the case m = 1. Taking trial values of $q_\alpha$, the integrals

$$I(\bar{w}, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{q_\alpha \bar{w}} e^{-\frac{1}{2}x^2}\, dx$$

were calculated at intervals of 0.25 for $\bar{w}$. Quadrature was then applied to the products
$y(\bar{w})\, I(\bar{w}, q_\alpha)$ over the range $0 \le \bar{w} < \infty$ to obtain corresponding values of $(1 - \alpha)$, where $\alpha$ is
the sum of the two tails of the distribution beyond deviations $\pm q_\alpha$. Repeating this procedure, a series
of values of $(1 - \alpha)$ were obtained, corresponding to a set of equidistant values of $q_\alpha$. Backward
interpolation was then used to obtain the six values of $q_\alpha$ corresponding to the six
values of $\alpha$ under consideration. Finally, the required percentage points of u were obtained


Table 2. Framework values of percentage points of u = u(2, n)

n \ α     0.10      0.05      0.02      0.01      0.002    0.001
2         2.6203    3.8671    6.266     8.932     20.10    28.47
3         2.0201    2.6365+   3.555+    4.340     6.78     8.66
4         1.8760    2.3672    3.047     3.600     5.07     5.80
6         1.7791    2.1920    2.727     3.133     4.11     4.52
10        1.7225+   2.0926    2.551     2.884     3.64     3.96
16        1.6961    2.0470    2.472     2.775+    3.44     3.71
20        1.6879    2.0329    2.449     2.742     3.38     3.64































by multiplication by the appropriate value of $d_n$, and are given in Table 2, together with
the exact values for n = 2 determined from (34).

As before, Lagrangian formulae were used to interpolate the intermediate values of the
percentage points of u, again taking arguments proportional to 1/n and reciprocals of the
percentage points for the variate. As a check, the values were inspected by determining
differences up to the third order and then reduced by one place of decimals. With the
exception of the values for n = 5, the reduced values are given in Tables 3-8.

For n = 5 the six percentage points of u were independently determined at a later stage
of the investigation by the same methods used for the framework values. These directly
computed values are given in the tables mentioned above, and are also reproduced below
for comparison with those obtained by interpolation from the framework values.


Percentage points of u = u(2, 5)

α =                      0.10     0.05     0.02    0.01    0.002   0.001
By direct calculation    1.814    2.254    2.84    3.29    4.4     5.0
By interpolation         1.814    2.254    2.84    3.29    4.4     4.9
















In the case of α = 0.001, direct calculation gives a value of 4.97 compared with 4.92
obtained by interpolation. For other percentage levels the agreement is exact to the number
of figures quoted.


(vi) COMPUTATION OF PERCENTAGE POINTS OF THE DISTRIBUTION OF 
u=u(m, n) FOR m > 2 


The variance of the mean range in m subgroups of equal size n steadily decreases as m
increases, and the ratio $\bar{w}(m, n)/d_n$ gives closer estimates of the population value of the
standard deviation of the variate. Hence, following the usual methods of large-sample theory,
the limiting values of the percentage points of u, for indefinitely large m, may be determined
from integral tables of the normal curve. For a given value of α, the limiting values of the
percentage points are, of course, equal for all values of n, and are also equal to the corresponding
limiting values of the percentage points of Fisher’s t-distribution for an indefinitely large
number of degrees of freedom.

In general it was found for a particular value of α and of n that a three-point Lagrangian
curve with 1/m as argument and reciprocals of the percentage points of u as variate (passing
through points corresponding to m = 1, 2 and ∞) may be used for interpolation of the
required percentage points corresponding to values of m intermediate between 2 and ∞.
Only in the cases of n = 2 and n = 3 was the required accuracy not attained by this procedure,
and further investigation was found necessary. Details of the methods used are given below.
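
The interpolation scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a record of the original computing procedure: the function names are hypothetical, and the inputs are the 5% points for n = 5 quoted in this paper (2.63 for m = 1 from Table 4, 2.254 for m = 2) together with the limiting normal value 1.960 for m = ∞.

```python
def lagrange(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total


def interpolate_point(known, m):
    """Interpolate a percentage point of u for m subgroups.

    `known` maps m (float('inf') for the limiting case) to a percentage point.
    The argument of the interpolation is 1/m and the variate is 1/u.
    """
    xs = [0.0 if mk == float("inf") else 1.0 / mk for mk in known]
    ys = [1.0 / u for u in known.values()]
    return 1.0 / lagrange(xs, ys, 1.0 / m)


# 5% points for n = 5 as quoted in the paper; 1.960 is the normal limit.
known_5pct_n5 = {1: 2.63, 2: 2.254, float("inf"): 1.960}
u_4_5 = interpolate_point(known_5pct_n5, 4)
print(round(u_4_5, 2))
```

With these inputs the interpolated value for m = 4 comes out close to the 2.10 tabulated for m = 4, n = 5 in Table 4.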

In the case of n = 2, the percentage points of the distribution of u(m, 2) were also
determined for m = 4 and m = 8 as follows. First, considering m = 4, it was necessary to obtain
numerical estimates of the ordinates of the distribution of the means of four ranges, each
range from a random sample of two values from a normal population with unit standard
deviation. Following the method used for m = 2 and leading to equation (28), it is easy to
show that the distribution of $\bar{w}(2m, n)$, the mean of 2m independent ranges each from a
random subsample of n values, is given in terms of the distribution of the mean, $\bar{w}(m, n)$, of
m such ranges by

$$p(\bar{w}(2m, n)) = 2\int_0^{2\bar{w}(2m, n)} p(\bar{w}(m, n))\, p(2\bar{w}(2m, n) - \bar{w}(m, n))\, d\bar{w}(m, n). \tag{35}$$


Using numerical values for $p(\bar{w})$ (m = 2, n = 2) given by equation (29) above, estimates
of the ordinates of the distribution $p(\bar{w})$ for m = 4, n = 2 at intervals of 0.25 were found by
quadrature methods similar to those described in previous sections by using the above
expression. A repetition of this process, using these last computed values, yielded numerical
estimates of the distribution of the means of eight ranges, i.e. m = 8. Again applying
quadrature to the two distributions, values of (1 − α) were determined for a series of equidistant
values of $q_\alpha$ (m = 4 and m = 8). The required values of $q_\alpha$, corresponding to the six values
of α between 0.10 and 0.001 under consideration, were then obtained by backward
interpolation, and hence the percentage points of u(4, 2) and u(8, 2).
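
A rough numerical sketch of the repeated use of equation (35) is given below. It is an illustration only: the grid spacing (0.05 rather than the 0.25 used in the paper), the trapezoidal rule and all function names are assumptions of this sketch. The starting density is that of the range of two values from a unit normal population, $p(w) = e^{-w^2/4}/\sqrt{\pi}$ for w ≥ 0, which follows because the difference of two such values is normal with variance 2.

```python
import math


def range_density_n2(w):
    """Density of the range of two values from a unit normal population (w >= 0)."""
    return math.exp(-w * w / 4.0) / math.sqrt(math.pi)


def double_subgroups(ordinates, h):
    """Apply equation (35) on a grid.

    ordinates[k] is the density of the mean of m ranges at k*h. The result is
    the density of the mean of 2m ranges on the finer grid j*(h/2).
    """
    top = len(ordinates) - 1
    doubled = []
    for j in range(2 * top + 1):
        lo, hi = max(0, j - top), min(j, top)
        terms = [ordinates[i] * ordinates[j - i] for i in range(lo, hi + 1)]
        # Trapezoidal rule for the integral of p(v) * p(2*wbar - v) dv.
        integral = h * (sum(terms) - 0.5 * (terms[0] + terms[-1]))
        doubled.append(2.0 * integral)
    return doubled, h / 2.0


def moment(ordinates, h, power):
    """Trapezoidal estimate of the integral of w**power * p(w)."""
    values = [((k * h) ** power) * p for k, p in enumerate(ordinates)]
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


h1 = 0.05
p1 = [range_density_n2(k * h1) for k in range(201)]  # m = 1, w from 0 to 10
p2, h2 = double_subgroups(p1, h1)                    # m = 2
p4, h4 = double_subgroups(p2, h2)                    # m = 4

for label, p, h in (("m=1", p1, h1), ("m=2", p2, h2), ("m=4", p4, h4)):
    print(label, round(moment(p, h, 0), 4), round(moment(p, h, 1), 4))
```

Each computed distribution should have total area close to unity and a mean close to $d_2 = 2/\sqrt{\pi} \approx 1.128$, which gives a simple check on the quadrature before percentage points are extracted.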

The sets of six percentage points of u(m, 2) were determined for each required value of m 
by Lagrangian interpolation, reciprocals of m being used as argument and reciprocals of 
the corresponding percentage points as variate, the curve passing through the points
corresponding to m = 1, 2, 4, 8 and ∞. The interpolated values of percentage points obtained by
this method, and the directly computed values for m = 4 and m = 8, are given in Tables 3-8
for a series of values of m suitable for practical use. As a check upon the method, the
computation was repeated, this time using a four-point Lagrangian passing through points
corresponding to m = 1, 2, 4 and ∞. Most of these interpolated four-point values of the
percentage points agree exactly with the five-point values previously obtained. In cases 
where differences arise, none exceed 1 unit in the last figure. The five-point Lagrangian 
method of interpolation therefore certainly appears to be quite adequate for furnishing the 
required degree of accuracy. 

Numerical estimates of the distribution of the means of pairs of ranges from subsamples 
of size n = 3 and n = 4 have already been obtained above. Using these in turn in equation 
(35), estimates of the ordinates of the distributions of the means of four ranges were
determined at intervals of 0.25 by quadrature methods. The sets of percentage points of u(4, 3)
and u(4,4) were then computed by the previous method of trial values and subsequent 
backward interpolation. 

For n = 3 and n = 4, the percentage points of the distribution of u, for given values of m,
are lower and nearer to their limiting values than the corresponding points for n = 2. 
Furthermore, the changes in the values of the percentage points for small values of m are 
also less abrupt. In view of the agreement between the four-point and five-point Lagrangian 
interpolated values of the percentage points for n = 2, a four-point Lagrangian through 
points corresponding to m = 1, 2, 4 and ∞ may certainly be relied upon to give adequate
accuracy for the interpolation of percentage points corresponding to intermediate values
of m in the case of n = 3 and n = 4. These values, together with the computed values for
m = 4, are given in Tables 3-8. 

For the remaining values of n, from 5 to 20, the interpolated percentage points given in 
Tables 3-8 have been obtained by means of a three-point Lagrangian curve, using values 
of the percentage points corresponding to values of m = 1, 2 and ∞. As in the previous cases
of interpolation, reciprocals of m and the variate were used in order to obtain small changes 
in successive differences. To show that this method is adequate, the six sets of percentage 
points for n = 4 were also interpolated using a three-point Lagrangian. In every case except 
one, these values agreed exactly with the four-point Lagrangian interpolated values pre- 
viously found and given in Tables 3-8. In the case of the sole exception, the difference 
between the two interpolated values was only one unit in the last figure. For the less rapidly 
changing values of the percentage points of u for n ≥ 5, the three-point method of
interpolation therefore provides sufficient accuracy for the present purpose.

Taking all checks into consideration it appears that the tabulated values of the per- 
centage points of the distribution of the function u = u(m,n) may be relied upon to the 
accuracy given: occasionally the values may be one unit in error in the last figure. 

In Tables 3 and 4, the 10% and 5% points of u were computed to 3 decimal places,
but lack of space has necessitated these being curtailed for publication. For the same
reason the values of the percentage points of u for the odd values of n = 11, 13, ..., 19 have
been omitted. In practical applications of the test, it is not considered that this reduction
will cause any undue inconvenience. Fuller tables have, however, been retained for
forthcoming work on the power of the u-test and are available for consultation if required.


(vii) APPROXIMATE VALUES OF THE PERCENTAGE POINTS OF u


If there are m subgroups each of n values, and if the estimate of standard deviation is
determined as the root-mean-square of the deviations of variate values from the respective
means of the subgroups, then the number of degrees of freedom is ν = m(n − 1). Unlike the
usual t-test, when the estimate of standard deviation is determined from the mean range in
m subgroups of equal size n, the percentage points of the modified t-distribution investigated
above depend upon the relation between m and n. Reference to Tables 3-8 indicates that,
for a constant number of degrees of freedom ν = m(n − 1), the values of the percentage points
on a given probability level vary slightly as m and n vary. For example, taking α = 0.05
and ν = 8, we have the following percentage points: 2.272 for m = 1 and n = 9, 2.254 for
m = 2 and n = 5, 2.250 for m = 4 and n = 3, and 2.264 for m = 8 and n = 2. In general,
however, the range in the values of the percentage points of u for a given value of ν is small,
and this permits the construction of a table giving approximate values of the six sets of
percentage points corresponding to different numbers of degrees of freedom.


Approximate values of percentage points of u

[Table: rows give the degrees of freedom ν = m(n − 1); columns give the values of α = 0.10, 0.05, 0.02, 0.01, 0.002 and 0.001. The tabulated entries are illegible in this copy.]

For a particular pair of values of m and n, the values of the percentage points for
ν = m(n − 1) degrees of freedom given in the table above are generally not in error by more
than one unit in the last place of figures. This degree of accuracy is sufficient for
many practical applications of the distribution of u. To settle the significance of cases giving
values of u close to the above approximate values, reference should be made to the accurate
values given in Tables 3-8.


(viii) APPLICATIONS OF THE U-TEST 
The difference between the mean of a sample of n random values of a normally distributed
variate and the population value is shown in the Appendix to be independent of the total
range in the sample, and also independent of the mean range determined from random
subgroups of values. The modified t-test based on range estimates of standard deviation may
therefore be used in various statistical tests of significance involving deviations of sample
means. The application of this range test to sampling problems is analogous to that of the
well-known t-test, and no detailed description is therefore required. The most frequent use
of the new test will be found in the treatment of experimental data of various types, and 
also in the examination of test results recorded for the purpose of control of the quality of 
industrial products. In this latter type of work, cases frequently arise when it is desirable to 
apply a rapid test for determining the significance of a difference between the mean of a 
sample and some preassigned value, frequently some desired control level, or the significance 
of the difference between two sample means. Furthermore, for routine purposes, it is often 
desirable that the test should not only be rapid but also of a simple nature, thus enabling it 
to be used by workers with little mathematical or even arithmetical aptitude. The new range 
test has the advantages of greater simplicity and greatly reduced amount of computing 
compared with the standard t-test. The use of range estimates of standard deviation, in 
place of root-mean-square estimates, necessarily entails some loss of precision, but in a future 
paper it will be shown that this reduction in accuracy is small and certainly negligible for 
most practical purposes. 

The most frequent applications of the range test are considered below and are followed 
by several numerical examples in which, for purposes of comparison, the parallel treatment 
by the t-test is also given. As in the t-test, the application of the range test involves the 
assumption of normality of variate distribution and randomness of sampling. Furthermore, 
where the standard deviation is estimated from the mean range of several subgroups of 
values, care should be taken to ensure that the arrangement of these values is also random. 
This latter condition is usually fulfilled by considering the values in the order in which they 
were originally recorded. In a few cases, however, the order of recording may not be random; 
the particular circumstances of a test may be such that the order of the observations may be 
wholly or partly dependent upon their magnitude. In such cases a set of values can be
divided into random subgroups by the use of tables of random sampling numbers or by
other means.


(a) Difference between sample mean and population mean 


Suppose we have some preassigned value ξ, and wish to test whether the mean $\bar{x}$ of a
sample of N values may be considered as a reasonable estimate of ξ, or whether the difference
between $\bar{x}$ and ξ is real in the statistical sense. The usual assumption, the so-called ‘Student’s
Hypothesis’, is made that $\bar{x}$ is the mean of a random sample from a normal population of
which the mean is ξ and standard deviation is σ. The differences ($\bar{x}$ − ξ) will be distributed
about a mean of zero with a standard error equal to $\sigma/\sqrt{N}$. If the sample be divided into m
random subgroups of equal size n, so that N = mn, and $\bar{w}$ is the mean of the m ranges of the subgroups,
then the sample estimate of the standard error of the mean is $\bar{w}/(d_n\sqrt{N})$. The ratio of the
difference between the means to the estimate of its standard error is

$$u = \frac{|\bar{x} - \xi|\, d_n \sqrt{N}}{\bar{w}}. \tag{36}$$


If the computed value of u exceeds the corresponding percentage point in one of Tables
3-8, then the difference is considered unlikely to have arisen through random sampling on
that particular probability level α. As in the case of the t-test, when considering the
asymmetrical case of ‘Student’s Hypothesis’, the values of α at the headings of the tables should
be halved.

For fairly small values of N, the estimate of the standard error of the mean may be
determined, not from the mean range in subgroups, but from the total range between the maximum
and minimum values in the sample. In the notation used above, this corresponds to m = 1
and n = N. The test of the significance of the difference may be made as above and the
computed value of u compared with the percentage points in Tables 3-8. In these cases,
however, the computation may be curtailed by using the ratio

$$\frac{\delta}{w} = \frac{u(1, n)}{d_n \sqrt{n}}, \tag{37}$$

where $|\bar{x} - \xi| = \delta$, and w is the range in the undivided sample. Table 9 gives values of the
ratio δ/w for various levels of significance corresponding to the sum of the two tails of the
distribution. For a chosen level of significance the difference δ is considered too large to
have arisen through random sampling errors if the value of δ/w exceeds the corresponding
tabulated value. Table 9 will also be found useful for giving a rapid estimate of the accuracy
of the mean based on a small number of observations.


(b) Difference between two sample means 


Suppose the first sample of size $N_1$ be divided into $m_1$ random subgroups of size n and the
second sample of size $N_2$ be divided into $m_2$ random subgroups also of size n, i.e.
$n = N_1/m_1 = N_2/m_2$.
The hypothesis is made that each sample can be considered as a random selection from the
same normal population. Let the numerical value of the difference between the two sample
means be $|\bar{x}_1 - \bar{x}_2|$, and the mean of the ($m_1 + m_2$) ranges of n values be $\bar{w} = \bar{w}(m_1 + m_2, n)$,
giving an estimate $\bar{w}/d_n$ for the standard deviation of the variate. The ratio of the difference
between the two sample means to the range estimate of the standard error of the difference is

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n}{\bar{w}\sqrt{1/N_1 + 1/N_2}}. \tag{38}$$

The significance of the difference between the means in any particular case can be
determined by noting whether the computed value of u exceeds the corresponding percentage
point for a chosen value of α by reference to Tables 3-8, using the column headed $m = m_1 + m_2$.

When the samples are small and of equal size, say n, the variate standard deviation can
be estimated from the two total ranges in the samples. If w′ and w″ are the two ranges, with
a mean value $\bar{w} = \tfrac{1}{2}(w' + w'')$, then

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n \sqrt{\tfrac{1}{2}n}}{\bar{w}} \tag{39}$$

may be used as above for testing the significance of the difference between the two means.

A more rapid test may, however, be made by simply determining the value of the ratio of
the difference between sample means to the average of the two sample ranges

$$\frac{|\bar{x}_1 - \bar{x}_2|}{\tfrac{1}{2}(w' + w'')} = \frac{u(2, n)}{d_n \sqrt{\tfrac{1}{2}n}}. \tag{40}$$


In Table 10 are given values of the above ratio lying on six different probability levels. For
a given level of significance α, values of the ratio smaller than those tabulated may be
considered to have arisen through random sampling errors; greater values indicate that a given
difference is unlikely to have arisen through chance and therefore point to a real difference.

In the computation of u it is necessary to use values of $d_n$, the mean range in samples from
a normal population of unit standard deviation. A selection of the values determined by
Tippett (1925) is reproduced in Table 11 to avoid the necessity of frequent reference to his
original paper, and is accompanied by the corresponding values of $\sqrt{n}$ and $d_n\sqrt{n}$.
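
The way in which the shortened ratios of equations (37) and (40) follow from the percentage points of u can be sketched as below. This is an illustration only. The function names are hypothetical, and the values of the mean range $d_n$ are the standard ones rounded to three decimals, assumed here to agree with Table 11. The inputs are two-decimal 5% points of u read from Table 4 (2.63 for m = 1, n = 5 and 2.64 for m = 2, n = 3), so the results can only approximate the three-decimal values 0.507 and 1.272 quoted from Tables 9 and 10 in Examples 2 and 4.

```python
import math

# Mean range d_n in samples of n from a unit normal population
# (standard values, rounded; assumed to match Table 11).
D_N = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534, 10: 3.078}


def single_sample_ratio(u_1n, n):
    """Equation (37): critical value of delta/w from a percentage point of u(1, n)."""
    return u_1n / (D_N[n] * math.sqrt(n))


def two_sample_ratio(u_2n, n):
    """Equation (40): critical value of |x1bar - x2bar| / (mean of the two ranges)."""
    return u_2n / (D_N[n] * math.sqrt(0.5 * n))


ratio_table9 = single_sample_ratio(2.63, 5)   # compare with 0.507
ratio_table10 = two_sample_ratio(2.64, 3)     # compare with 1.272
print(round(ratio_table9, 3), round(ratio_table10, 3))
```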


(c) Confidence intervals 


As with ‘Student’s’ test, the tables of percentage points may be used to estimate, with
a given measure of confidence, the interval within which it can be stated that ξ or $\xi_1 - \xi_2$ lies.


Examples 


Example 1. The following data have been previously used as an example by ‘Student’ 
(1908). Ten patients were treated with the optical isomers of hyoscyamine hydrobromide 
and the additional hours of sleep were noted. 


Additional hours sleep gained by use of hyoscyamine hydrobromide

Patient    Dextro-(D)    Laevo-(L)    Difference (D − L)
   1          +0.7          +1.9            +1.2
   2          −1.6          +0.8            +2.4
   3          −0.2          +1.1            +1.3
   4          −1.2          +0.1            +1.3
   5          −0.1          −0.1             0.0
   6          +3.4          +4.4            +1.0
   7          +3.7          +5.5            +1.8
   8          +0.8          +1.6            +0.8
   9           0.0          +4.6            +4.6
  10          +2.0          +3.4            +1.4
 Means        +0.75         +2.33           +1.58

The last column may be used for the controlled comparison of the two drugs, since their 
effects were measured on the same ten patients. The laevo form has given a greater figure for 
the additional hours sleep than the dextro form. Whether the former may be considered as 
the better soporific is examined by both the standard deviation and range tests. 

(a) The sum of squares of deviations of the differences about their mean value is 13.616,
associated with 9 degrees of freedom. The estimate of the standard error is therefore 0.3890,
and the value of t works out to be 1.58/0.3890 = 4.06. For 9 degrees of freedom a value of
t = 3.250 lies on the 1% level of significance. Assuming normal random sampling, a value
of t equal to or greater than 4.06 will occur much less frequently than once in a hundred times.
This leads to the conclusion that the laevo form is better for producing sleep than the dextro
form.

(b) For examination by the range, the value of u = u(1, 10) may be computed, but in this
case it is simpler to use the shortened method of equation (37). The ratio of the mean
difference to the range in the ten individual differences is δ/w = 1.58/4.6 = 0.34. Reference to
Table 9 shows that this value is slightly in excess of the tabulated value 0.333 on the 1%
level of significance, leading to the same conclusion as that drawn from the t-test.










The greater significance suggested by the t-test seems to be largely due to the exceptional
difference D − L for Patient No. 9, viz. 4.6, which affects s more seriously than w.
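
The arithmetic of Example 1 can be retraced with the short Python sketch below, given purely as an illustration. The variable names are hypothetical; the critical values 3.250 (t, 9 degrees of freedom) and 0.333 (Table 9, n = 10) are copied from the text, not computed.

```python
import math

# Differences in hours of sleep for the ten patients (last column of the table).
differences = [1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4]
n = len(differences)
mean = sum(differences) / n

# (a) Standard deviation test.
sum_squares = sum((d - mean) ** 2 for d in differences)
std_error = math.sqrt(sum_squares / (n - 1) / n)
t = mean / std_error

# (b) Range test in the shortened form of equation (37).
w = max(differences) - min(differences)
ratio = mean / w

T_1PCT_9DF = 3.250       # quoted in the text
TABLE9_1PCT_N10 = 0.333  # quoted in the text from Table 9

print(round(sum_squares, 3), round(std_error, 4), round(t, 2))
print(round(ratio, 3), ratio > TABLE9_1PCT_N10, t > T_1PCT_9DF)
```

Both statistics exceed their 1% values, the range ratio only narrowly, which is the behaviour discussed in the remark on Patient No. 9.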

Example 2. In the calibration of a viscometer it is necessary to time the interval required
for the level of an aqueous solution of glycerol to fall between two fixed marks. For
satisfactory calibration it is considered desirable that the mean time of flow should be accurate
to ±½ sec., risking a greater error not more frequently than 1 in 20 times. Five independent
determinations of the time interval (in seconds) for one viscometer were 103.5, 104.1, 102.7,
103.2 and 102.6. While this number of observations is clearly too small for a final assessment
of accuracy, it is often useful to get an interim answer to guide further action.

(a) The sum of squares of the deviations of the five observations about their mean is
1.508, associated with 4 degrees of freedom, giving an estimate of the standard error of the
mean equal to 0.275. Reference to tables shows that a value of t equal to 2.776 lies on the
5% level of significance. Hence in 19 times out of 20 it would be expected that a sample mean
will not diverge from the true mean value by more than ±2.776 × 0.275 = ±0.76 sec. This
error exceeds the assigned limits of ±½ sec. and therefore points to the necessity of further
tests to fulfil the required conditions.

(b) Instead of computing an estimate of the standard error of the mean from the range
(w = 104.1 − 102.6 = 1.5) in the five determinations, we note from Table 9 that a value of
δ/w = 0.507 lies on the 5% level of significance. Hence in 19 times out of 20 the sample
mean will differ from its true value by an amount up to a deviation of

±δ = ±0.507 × 1.5 = ±0.76,

a result in agreement with that yielded by the t-test.
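
As a brief illustration of the two interval calculations in Example 2, the sketch below recomputes both limits. Variable names are hypothetical, and the constants 2.776 and 0.507 are simply those quoted in the text.

```python
import math

times = [103.5, 104.1, 102.7, 103.2, 102.6]  # seconds
n = len(times)
mean = sum(times) / n

# (a) Limit from the root-mean-square estimate and the t-table.
sum_squares = sum((x - mean) ** 2 for x in times)
std_error = math.sqrt(sum_squares / (n - 1) / n)
T_5PCT_4DF = 2.776                  # quoted in the text
limit_t = T_5PCT_4DF * std_error

# (b) Limit from the range and the Table 9 ratio.
w = max(times) - min(times)
TABLE9_5PCT_N5 = 0.507              # quoted in the text from Table 9
limit_range = TABLE9_5PCT_N5 * w

print(round(limit_t, 2), round(limit_range, 2))
```

The range method needs only one subtraction and one multiplication, which is the saving in computation claimed for it.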

Example 3. In the processing of raw cotton, modifications were made in the design of one
of the machines with the object of improving the efficiency of cleaning. Tests were made
on a series of 24 different mixings for the purpose of determining whether yarn strength was
adversely affected by the mechanical alterations. The results of the 24 pairs of comparisons
are given below (the strength being expressed as a count × strength product), together with
the differences between them expressed as percentages of the corresponding strengths under
standard conditions.


Yarn strengths under standard and modified conditions

       Strength          Percentage             Strength          Percentage
 Standard   Modified     difference       Standard   Modified     difference
    S          M        100(M − S)/S         S          M        100(M − S)/S

   1805       1763          −2.3            1931       1898          −1.7
   1870       1901          +1.7            1508       1520          +0.8
   2000       2026          +1.3            2111       2119          +0.4
   1823       1904          +4.4            1496       1481          −1.0
   1603       1619          +1.0            1672       1723          +3.1
   1889       1830          −3.1            1947       1759          −9.7
   2058       2019          −1.9            1960       1934          −1.3
   1806       1850          +2.4            1624       1594          −1.8
   1056       1112          +5.3            2162       2170          +0.4
   1857       1782          −4.0            1915       1967          +2.7
   1801       1720          −4.5            1738       1810          +4.1
   2094       2144          +2.4            1609       1613          +0.2











































The mean value for the percentage difference in strength is −0.46. Whether this is an
indication that the mechanical modifications have resulted in the production of weaker
yarns is examined by means of the standard deviation and range tests.

(a) The sum of squares of the deviations of the percentage differences about their mean
value is 256.48, based on 23 degrees of freedom. The estimate of the standard deviation of
the percentage differences is 3.34 and the standard error of their mean value is 0.68, giving
a value of t = 0.46/0.68 = 0.59. This is much below the value of 2.069 on the 5% level of
significance and leads to the conclusion that there are no grounds for suspecting that the
mechanical alterations have led to the production of weaker yarns.

(b) The number of observations places this case outside the range of Table 9, and it is
therefore necessary to use the modified t-function. The data are arranged in random order
of their occurrence, and split into four groups of six. The ranges in the sets of six differences
are 7.5, 9.8, 12.8 and 5.9 with a mean value $\bar{w}(4, 6)$ = 9.0. The estimate of the variate standard
deviation is $\bar{w}(4, 6)/d_6$ = 3.55, giving 0.72 for the standard error of the mean percentage
difference, and u = 0.46/0.72 = 0.64. The 5% level of significance is, from Table 4, equal to
2.07, much greater than the value computed from the data and therefore indicates the same
conclusion as above.
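
The range calculation of Example 3(b) may be sketched as follows, under stated assumptions. The helper names are hypothetical; the value $d_6$ = 2.534 is the standard mean range for samples of six, assumed to agree with Table 11; and the mean percentage difference is taken as quoted in the text (−0.46) rather than recomputed. The subgroup ranges are formed from the tabulated differences read down the left-hand block and then the right-hand block.

```python
import math

D_6 = 2.534  # mean range for n = 6, unit normal population (standard value)

# Percentage differences in order of occurrence, six per subgroup.
differences = [
    -2.3, 1.7, 1.3, 4.4, 1.0, -3.1,
    -1.9, 2.4, 5.3, -4.0, -4.5, 2.4,
    -1.7, 0.8, 0.4, -1.0, 3.1, -9.7,
    -1.3, -1.8, 0.4, 2.7, 4.1, 0.2,
]


def subgroup_ranges(values, n):
    """Ranges of consecutive subgroups of n values."""
    groups = [values[i:i + n] for i in range(0, len(values), n)]
    return [max(g) - min(g) for g in groups]


def u_statistic(deviation, ranges, n, d_n):
    """Equation (36): |deviation| * d_n * sqrt(N) / (mean range), with N = m*n."""
    mean_range = sum(ranges) / len(ranges)
    return abs(deviation) * d_n * math.sqrt(n * len(ranges)) / mean_range


ranges = subgroup_ranges(differences, 6)
sigma_hat = (sum(ranges) / len(ranges)) / D_6
u = u_statistic(-0.46, ranges, 6, D_6)  # -0.46 as quoted in the text
print([round(r, 1) for r in ranges], round(sigma_hat, 2), round(u, 2))
```

The computed u differs from the 0.64 of the text only through the rounding of the intermediate standard error, and in either case lies far below the 5% point 2.07.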

Example 4. Independent determinations of percentage trash content were made in tri- 
plicate on two samples of raw cotton and the following results obtained: 


Percentage trash content of raw cotton

            Sample A    Sample B
              1.13        0.76
              1.31        0.64
              1.25        1.01
 Means        1.23        0.80














The point to be decided is whether sample B may be said to be cleaner than sample A, 
or whether the difference between the two average percentage trash contents may be 
accounted for by random experimental variation. Since, in this case, the comparisons are 
not paired, the standard error of the difference between the mean values of the two samples 
has necessarily to be estimated from the variation within each of the two sets of results. 
As before, normal variation in sampling and in testing errors is assumed. 

(a) The sum of squares of deviations of each set of values from their mean is 0.01680 for A
and 0.07167 for B, each associated with 2 degrees of freedom. The best estimate of the error
standard deviation is therefore 0.149, giving 0.122 for the estimate of the standard error
of the difference between the two means. The value of t is equal to (1.23 − 0.80)/0.122 = 3.5
which exceeds the value 2.776 obtained from tables for 4 degrees of freedom and α = 0.05.
On this level of significance the result is taken to indicate a real difference in the cleanliness
of the two cottons.

(b) The difference between the two sample means is 0.43 and the mean of the two ranges
is 0.275. Hence, using the ratio of equation (40), $|\bar{x}_1 - \bar{x}_2| / \tfrac{1}{2}(w' + w'')$ = 1.6 which, from
Table 10, is seen to exceed the value of 1.272 lying on the 5% level and therefore is taken to
indicate a significant difference in the mean values.
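
A minimal sketch of the rapid two-sample test of Example 4(b) follows. The helper names are hypothetical, and the critical value 1.272 is the Table 10 figure quoted in the text.

```python
sample_a = [1.13, 1.31, 1.25]
sample_b = [0.76, 0.64, 1.01]


def mean(values):
    return sum(values) / len(values)


def sample_range(values):
    return max(values) - min(values)


difference = abs(mean(sample_a) - mean(sample_b))
mean_range = 0.5 * (sample_range(sample_a) + sample_range(sample_b))
ratio = difference / mean_range  # left-hand side of equation (40)

TABLE10_5PCT_N3 = 1.272  # quoted in the text from Table 10
print(round(mean_range, 3), round(ratio, 2), ratio > TABLE10_5PCT_N3)
```

Working from unrounded means gives a ratio slightly below the 1.6 of the text, which uses the rounded difference 0.43; the conclusion is unaffected.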










Example 5. The following strength test results were obtained on two batches of cotton
yarn (measurements recorded to the nearest ½ lb.) and are noted downwards in order of
random occurrence:














          Sample A                  Sample B
  30.5      31.0      29.5        27.0      28.5
  28.0      31.5      27.5        28.5      25.0
  29.5      30.0      28.0        26.5      28.0
  28.0      27.5      26.0        27.0      27.5
  26.5      29.5      28.5        27.0      27.5
  29.5      27.5      26.5        28.5      28.5
  27.5      28.0      27.0        28.0      28.0
  28.0      32.5      30.0        28.0      26.0
  29.5      28.5      28.5        25.0      26.5
  30.5      29.0      31.0        29.0      28.0

















The mean of sample A is 28.90 lb. and that of sample B is 27.40 lb., and the question arises as
to whether sample B is actually weaker than A.

(a) The sums of squares of the deviations about their respective mean values are 68.2
for A and 24.8 for B, associated with 29 and 19 degrees of freedom. The estimate of the error
standard deviation is therefore 1.392, giving 0.402 for the standard error of the difference
between the two means and a value of t equal to (28.9 − 27.4)/0.402 = 3.7. For 48 degrees of
freedom a value of t = 2.68 lies on the 1% level of significance. The greater value of 3.7
yielded by the data above indicates, therefore, that the difference in strength of the two yarns
may be accepted as statistically significant.

(b) The estimate of the error standard deviation is obtained from the ranges within groups
of ten values, three groups for sample A and two for sample B. The values of these five
ranges are 3.0, 5.0, 5.0, 4.0 and 3.5 with a mean value of 4.1 and a corresponding estimate of
error standard deviation equal to $\bar{w}(5, 10)/d_{10}$ = 1.33. The estimate of the standard error of
the difference between the two means is 0.384 and the value of u is (28.9 − 27.4)/0.384 = 3.9.
For five ranges of ten, the value of u on the 1% level of significance is, from Table 6, equal to
2.69 (cf. 2.68 for t with 48 degrees of freedom). The value of 3.9 obtained from the data is
greater than this value of 2.69 and this again leads to the conclusion that the difference in
mean strengths of the two yarns is ‘statistically significant’.
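
The calculation of Example 5(b) follows equation (38) and may be sketched as below. The sketch works only from the summary figures quoted in the text (the two means, the five subgroup ranges and the sample sizes); the function name is hypothetical, and $d_{10}$ = 3.078 is the standard mean range for samples of ten, assumed to agree with Table 11.

```python
import math

D_10 = 3.078  # mean range for n = 10, unit normal population (standard value)


def two_sample_u(mean_1, mean_2, size_1, size_2, ranges, d_n):
    """Equation (38), returning u with the intermediate estimates."""
    mean_range = sum(ranges) / len(ranges)
    sigma_hat = mean_range / d_n
    std_error = sigma_hat * math.sqrt(1.0 / size_1 + 1.0 / size_2)
    return abs(mean_1 - mean_2) / std_error, sigma_hat, std_error


# Summary values as quoted in the text of Example 5.
u, sigma_hat, std_error = two_sample_u(
    28.9, 27.4, 30, 20, [3.0, 5.0, 5.0, 4.0, 3.5], D_10
)
U_1PCT_M5_N10 = 2.69  # quoted in the text from Table 6
print(round(sigma_hat, 2), round(std_error, 3), round(u, 1), u > U_1PCT_M5_N10)
```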


Note added in proof. Since the present paper went to press, a note by Daly (1946) has
been published, in which it is suggested that the range may be used in place of the
root-mean-square estimate of variance in a test analogous to the t-test. The case where the
estimate of standard deviation is obtained from a single range is discussed, and values of the ratio
(deviation)/(range) on the 10% level of significance are given to two significant figures
for a number of low values of n. These agree with the corresponding values given in
Table 9, for α = 0.10, of the present paper. [Mr Lord’s paper was first submitted for
publication in August 1945. Ed.]













APPENDIX 


On the independence of mean and some linear estimates of standard 
deviation in random samples from a normal population 


In the above practical applications of the u distribution to normal random sampling
problems, it has been implicitly assumed that range estimates of standard deviation, like
root-mean-square estimates, are independent of the mean of the sample from which they have
been determined. The validity of this assumption is established below, where it is shown as
a particular case of a more general theorem.

Consider a set of n random values of a variable from a normal population of distribution

$$p(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \xi}{\sigma}\right)^2\right] dx. \tag{1}$$


Of such a set let $x_p$ and $x_q$ denote the pth and qth values (p < q) in ascending order of
magnitude, and denote the remaining (n − 2) values such that

$$\begin{aligned}
-\infty < x_r < x_p, &\quad r = 1, 2, \ldots, (p-1),\\
x_p \le x_r \le x_q, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
x_q < x_r < \infty, &\quad r = (q+1), (q+2), \ldots, n.
\end{aligned} \tag{2}$$


Now a set of any n values may be arranged in n! ways and in random samples all
arrangements of the same n values are of equal probability. The group of (p − 1) values all less than
$x_p$ are not ranked in any particular order, and there are hence (p − 1)! ways in which they
may be arranged. Similarly, the group of values from $x_{p+1}$ to $x_{q-1}$ may be arranged in
(q − p − 1)! ways and the third group from $x_{q+1}$ to $x_n$ in (n − q)! ways. The distribution of
random samples in which the pth and qth values in ascending order are denoted by $x_p$
and $x_q$, and the remaining values satisfy the conditions in (2), is therefore given by

$$p(x_1, x_2, \ldots, x_n)\,dx_1\,dx_2 \ldots dx_n = \left[\frac{n!}{(p-1)!\,(q-p-1)!\,(n-q)!}\right] \frac{1}{(\sigma\sqrt{2\pi})^n} \exp\left[-\frac{1}{2\sigma^2}\sum_{r=1}^{n}(x_r - \xi)^2\right] dx_1\,dx_2 \ldots dx_n, \tag{3}$$


where the constant term in brackets makes the total frequency of all such samples equal to
unity.

The joint distribution of the sample mean ($\bar{x}$) of the n values and the difference $\Delta = (x_q - x_p)$
is given by

$$p(\bar{x}, \Delta)\,d\bar{x}\,d\Delta = \frac{n!}{(p-1)!\,(q-p-1)!\,(n-q)!}\,\frac{1}{(\sigma\sqrt{2\pi})^n} \int\!\cdots\!\int \exp\left[-\frac{1}{2\sigma^2}\sum_{r=1}^{n}(x_r - \xi)^2\right] dx_1\,dx_2 \ldots dx_n, \tag{4}$$

where the multiple integral is evaluated over the domain of the x’s conditioned by the limits
indicated in (2) and by $\Delta = (x_q - x_p)$ and $\bar{x} = \frac{1}{n}\sum_{r=1}^{n} x_r$.

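Make the transformation to variables defined by

$$\begin{aligned}
\bar{x} &= \frac{1}{n}(x_1 + x_2 + \ldots + x_n),\\
y_1 &= -x_p + x_1,\\
y_2 &= -x_p + x_2,\\
&\;\;\vdots\\
y_{p-1} &= -x_p + x_{p-1},\\
y_{p+1} &= -x_p + x_{p+1},\\
&\;\;\vdots\\
y_n &= -x_p + x_n.
\end{aligned} \tag{5}$$

The Jacobian of the transformation is

$$\frac{\partial(x_1, x_2, \ldots, x_n)}{\partial(\bar{x}, y_1, \ldots, y_{p-1}, y_{p+1}, \ldots, y_n)} = 1,$$

and, from (2) and (5), the transformed limits of integration are

$$\begin{aligned}
-\infty < y_r < 0, &\quad r = 1, 2, \ldots, (p-1),\\
0 < y_r < \Delta, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
\Delta < y_r < \infty, &\quad r = (q+1), (q+2), \ldots, n,\\
y_q = \Delta. &
\end{aligned} \tag{6}$$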
Using the relations in (5) above it may be shown that

$$\sum_{r=1}^{r=n}(x_r - \xi)^2 = n(\bar{x} - \xi)^2 + \frac{n-1}{n}\sum_{r=1}^{r=n} y_r^2 - \frac{2}{n}\sum_{s=1}^{s=n-1}\sum_{t=1}^{t=n-s} y_s\, y_{s+t}, \tag{7}$$

where in the summations on the right r, s, s + t ≠ p. With the new variables we have, from
(4), (5), (6) and (7), the joint distribution of $\bar{x}$ and Δ given by











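$$\begin{aligned}
p(\bar{x}, \Delta)\,d\bar{x}\,d\Delta = {} & \left[\frac{\sqrt{n}}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{n(\bar{x} - \xi)^2}{2\sigma^2}\right\} d\bar{x}\right] \\
& \times \left[\frac{\sqrt{n}\,(n-1)!\,d\Delta}{(p-1)!\,(q-p-1)!\,(n-q)!\,(2\pi)^{\frac{1}{2}(n-1)}\sigma^{n-1}} \int_{-\infty}^{0} dy_1 \ldots \int_{-\infty}^{0} dy_{p-1} \int_{0}^{\Delta} dy_{p+1} \ldots \int_{0}^{\Delta} dy_{q-1} \int_{\Delta}^{\infty} dy_{q+1} \ldots \int_{\Delta}^{\infty} dy_n \right. \\
& \quad \left. \times \exp\left\{-\frac{1}{2n\sigma^2}\left((n-1)\sum_{r=1}^{r=n} y_r^2 - 2\sum_{s=1}^{s=n-1}\sum_{t=1}^{t=n-s} y_s\, y_{s+t}\right)\right\}\right],
\end{aligned} \tag{8}$$

with the restriction that r, s, s + t ≠ p, and Δ is to be substituted for $y_q$.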

The term in the first bracket of (8) is the distribution of the sample mean $\bar{x}$. It follows, therefore, that the term in the second bracket is the distribution of $\Delta$, because this expression does not involve $\bar{x}$ but is a function of $\Delta$ alone. This indicates that, in random samples from a normal population, the difference between the pth and qth values in order of magnitude is independent of the sample mean. It follows, therefore, that all estimates of the population standard deviation $\sigma$ determined from ranked variate differences (e.g. from the semi-interquartile range or other percentile measures of dispersion) are independent of the corresponding sample mean.

As a special case, when p = 1 and q = n, the difference between the pth and qth values becomes the difference between the lowest and highest, i.e. the range of the sample. Furthermore, if the values in a sample be divided into random subgroups, a simple extension of the argument shows that there is also statistical independence between sample mean and the corresponding mean range of the subgroups.
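The independence of sample mean and range can be illustrated by simulation. The following is only a minimal sketch: the function names, sample size, number of trials and seed are assumptions chosen for illustration, and a near-zero correlation is a necessary consequence of independence, not a proof of it.

```python
import math
import random


def mean_and_range(sample):
    """Return (sample mean, sample range) for a list of numbers."""
    return sum(sample) / len(sample), max(sample) - min(sample)


def pearson_r(xs, ys):
    """Plain product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=5, trials=20000, seed=1947):
    """Correlation between mean and range over many normal samples of size n."""
    rng = random.Random(seed)  # fixed seed (assumption) so the run is repeatable
    means, ranges = [], []
    for _ in range(trials):
        m, w = mean_and_range([rng.gauss(0.0, 1.0) for _ in range(n)])
        means.append(m)
        ranges.append(w)
    return pearson_r(means, ranges)


if __name__ == "__main__":
    print("correlation of mean and range:", simulate())
```

With 20000 trials the sampling error of the correlation is of order 0.007, so a value well inside ±0.05 is what independence would lead one to expect.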














E. LORD


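Table 3. 10% points of u = u(m, n)

n \ m | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 15 | 20 | 30 | 60
2 | 5.04 | 2.62 | 2.20 | 2.03 | 1.94 | 1.89 | 1.82 | 1.78 | 1.73 | 1.71 | 1.69 | 1.67(1)
3 | 2.59 | 2.02 | 1.88 | 1.81 | 1.77 | 1.75+ | 1.72 | .... | 1.69 | 1.67 | 1.66 | 1.66(1)
4 | 2.18 | 1.88 | 1.79 | 1.75+ | 1.73 | 1.72 | 1.70 | 1.69 | 1.67 | 1.67 | 1.66 | 1.65+
5 | 2.02 | 1.81 | 1.75+ | 1.73 | 1.71 | 1.70 | 1.68 | 1.68 | 1.67 | 1.66 | 1.66 | 1.65
6 | 1.94 | 1.78 | 1.73 | 1.71 | 1.70 | 1.69 | 1.68 | 1.67 | 1.66 | 1.66 | 1.65+ | 1.65-
7 | .... | .... | .... | .... | .... | .... | .... | 1.67 | 1.66 | 1.66 | 1.65+ | 1.65-
8 | .... | .... | .... | .... | .... | .... | .... | 1.66 | 1.66 | 1.65+ | 1.65+ | 1.65-
9 | 1.82 | 1.73 | 1.70 | 1.69 | 1.68 | 1.67 | 1.67 | 1.66 | 1.66 | 1.65+ | 1.65 | 1.65-
10 | .... | .... | .... | .... | .... | .... | 1.66 | 1.66 | 1.65+ | 1.65+ | 1.65 | 1.65-
12 | 1.78 | 1.71 | 1.69 | 1.68 | 1.67 | 1.67 | 1.66 | 1.66 | 1.65+ | 1.65+ | 1.65- | 1.65-
14 | 1.76 | 1.70 | 1.68 | 1.67 | 1.67 | 1.66 | 1.66 | 1.66 | 1.65+ | 1.65 | 1.65- | 1.65-
16 | 1.75 | 1.70 | 1.68 | 1.67 | 1.67 | 1.66 | 1.66 | 1.65+ | 1.65+ | 1.65 | 1.65- | 1.65-
18 | 1.74 | 1.69 | 1.68 | 1.67 | 1.66 | 1.66 | 1.66 | 1.65+ | 1.65+ | 1.65- | 1.65- | 1.65-
20 | 1.73 | 1.69 | 1.67 | 1.67 | 1.66 | 1.66 | 1.66 | 1.65+ | 1.65 | 1.65- | 1.65- | 1.65-

Entries shown as .... are illegible in the source; the row labels 6, 8, 10 and 12 and the column positions of the partial rows are inferred from the layout.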
Table 4. 5% points of u = u(m, n)

n \ m | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 15 | 20 | 30 | 60
2 | 10.14 | 3.87 | 2.98 | 2.66 | 2.49 | 2.38 | 2.26 | 2.20 | 2.11 | 2.07 | 2.03 | 2.00(2)
3 | 3.82 | 2.64 | 2.37 | 2.25 | 2.19 | 2.14 | 2.09 | 2.07 | 2.03 | 2.01 | 1.99 | 1.98(1)
4 | 2.95+ | 2.37 | 2.22 | 2.15- | 2.11 | 2.08 | 2.05 | 2.03 | 2.01 | 2.00 | 1.98 | 1.97
5 | 2.63 | 2.25+ | 2.15- | 2.10 | 2.07 | 2.05 | 2.03 | 2.01 | 2.00 | 1.99 | 1.98 | 1.97(1)
6 | 2.48 | 2.19 | 2.11 | 2.07 | 2.05- | 2.03 | 2.01 | 2.00 | 1.99 | 1.98 | 1.97 | 1.97(1)
7 | 2.38 | 2.15+ | 2.09 | 2.05+ | 2.03 | 2.02 | 2.01 | 2.00 | 1.98 | 1.98 | 1.97 | 1.97(1)
8 | 2.32 | 2.13 | 2.07 | 2.04 | 2.02 | 2.01 | 2.00 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97(1)
9 | 2.27 | 2.11 | 2.06 | 2.03 | 2.02 | 2.01 | 2.00 | 1.99 | 1.98 | 1.97 | 1.97 | 1.96
10 | 2.24 | 2.09 | 2.05- | 2.02 | 2.01 | 2.00 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97 | 1.96
12 | 2.19 | 2.07 | 2.03 | 2.01 | 2.00 | 2.00 | 1.99 | 1.98 | 1.97 | 1.97 | 1.97 | 1.96
14 | 2.16 | 2.06 | 2.02 | 2.01 | 2.00 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97 | 1.97 | 1.96
16 | 2.14 | 2.05- | 2.02 | 2.00 | 1.99 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97 | 1.97 | 1.96
18 | 2.12 | 2.04 | 2.01 | 2.00 | 1.99 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97 | 1.97 | 1.96
20 | 2.11 | 2.03 | 2.01 | 2.00 | 1.99 | 1.98 | 1.98 | 1.97 | 1.97 | 1.97 | 1.96 | 1.96


























Note. The numbers in brackets in the column headed m = 60 indicate the number of units which must be subtracted in the second decimal place to obtain the level for m = 120 and the same value of n. Where no figure is given u(120, n) = u(60, n) to second decimal place accuracy. E.g. for the 5% level, u(120, 2) = 1.98.





64 Range in place of standard deviation in the t-test 


Table 5. 2% points of u = u(m, n)

n \ m | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 15 | 20 | 30 | 60
2 | 25.39 | 6.27 | 4.27 | 3.60 | 3.27 | 3.08 | 2.86 | 2.73 | 2.59 | 2.52 | 2.45+ | 2.39(3)
3 | 6.19 | 3.56 | 3.05- | 2.84 | 2.72 | 2.65- | 2.56 | 2.51 | 2.45- | 2.42 | 2.39 | 2.36(2)
4 | 4.21 | 3.05- | 2.77 | 2.65- | 2.58 | 2.53 | 2.48 | 2.45- | 2.41 | 2.39 | 2.37 | 2.35-(1)
5 | 3.56 | 2.84 | 2.65- | 2.58 | 2.51 | 2.48 | 2.44 | 2.42 | 2.39 | 2.37 | 2.36 | 2.34(1)
6 | 3.25+ | 2.73 | 2.58 | 2.51 | 2.47 | 2.45- | 2.42 | 2.40 | 2.37 | 2.36 | 2.35+ | 2.34(1)
7 | 3.07 | 2.66 | 2.54 | 2.48 | 2.45+ | 2.43 | 2.40 | 2.39 | 2.37 | 2.36 | 2.35- | 2.34(1)
8 | 2.95+ | 2.61 | 2.51 | 2.46 | 2.43 | 2.42 | 2.39 | 2.38 | 2.36 | 2.35+ | 2.34 | 2.34(1)
9 | 2.87 | 2.58 | 2.49 | 2.45- | 2.42 | 2.41 | 2.39 | 2.37 | 2.36 | 2.35+ | 2.34 | 2.33
10 | 2.81 | 2.55+ | 2.47 | 2.44 | 2.41 | 2.40 | 2.38 | 2.37 | 2.35+ | 2.35- | 2.34 | 2.33
12 | 2.72 | 2.51 | 2.45- | 2.42 | 2.40 | 2.39 | 2.37 | 2.36 | 2.35+ | 2.34 | 2.34 | 2.33
14 | 2.67 | 2.49 | 2.43 | 2.41 | 2.39 | 2.38 | 2.37 | 2.36 | 2.35- | 2.34 | 2.34 | 2.33
16 | 2.63 | 2.47 | 2.42 | 2.40 | 2.38 | 2.37 | 2.36 | 2.35+ | 2.35- | 2.34 | 2.34 | 2.33
18 | 2.60 | 2.46 | 2.41 | 2.39 | 2.38 | 2.37 | 2.36 | 2.35+ | 2.34 | 2.34 | 2.33 | 2.33
20 | 2.58 | 2.45- | 2.41 | 2.39 | 2.37 | 2.37 | 2.36 | 2.35- | 2.34 | 2.34 | 2.33 | 2.33





















































[Table 6. 1% points of u = u(m, n)]

n \ m | 1 | 2 | 3 | 4 | 5 | 6 | 8 | 10 | 15 | 20 | 30 | 60
2 | 50.79 | 8.93 | 5.49 | 4.43 | 3.93 | 3.64 | 3.32 | 3.14 | 2.93 | 2.84 | 2.75- | 2.66(4)
3 | 8.82 | 4.34 | 3.60 | 3.30 | 3.14 | 3.03 | 2.91 | 2.84 | 2.75- | 2.70 | 2.66 | 2.62(2)
4 | 5.42 | 3.60 | 3.20 | 3.02 | 2.92 | 2.86 | 2.79 | 2.74 | 2.68 | 2.66 | 2.63 | 2.60(1)
5 | 4.38 | 3.29 | 3.02 | 2.90 | 2.83 | 2.79 | 2.73 | 2.70 | 2.66 | 2.64 | 2.62 | 2.60(1)
6 | 3.90 | 3.13 | 2.93 | 2.83 | 2.78 | 2.74 | 2.70 | 2.67 | 2.64 | 2.62 | 2.61 | 2.59(1)
7 | 3.63 | 3.03 | 2.87 | 2.79 | 2.75- | 2.72 | 2.68 | 2.66 | 2.63 | 2.62 | 2.60 | 2.59(1)
8 | 3.45+ | 2.97 | 2.83 | 2.76 | 2.72 | 2.70 | 2.67 | 2.65- | 2.62 | 2.61 | 2.60 | 2.59(1)
9 | 3.33 | 2.92 | 2.80 | 2.74 | 2.71 | 2.68 | 2.66 | 2.64 | 2.62 | 2.61 | 2.60 | 2.59(1)
10 | 3.24 | 2.88 | 2.78 | 2.72 | 2.69 | 2.67 | 2.65- | 2.63 | 2.61 | 2.60 | 2.59 | 2.59(1)
12 | 3.12 | 2.83 | 2.74 | 2.70 | 2.68 | 2.66 | 2.64 | 2.62 | 2.61 | 2.60 | 2.59 | 2.58
14 | 3.05- | 2.80 | 2.72 | 2.69 | 2.66 | 2.65- | 2.63 | 2.62 | 2.60 | 2.60 | 2.59 | 2.58
16 | 2.99 | 2.78 | 2.71 | 2.67 | 2.65+ | 2.64 | 2.62 | 2.61 | 2.60 | 2.60 | 2.59 | 2.58
18 | 2.95- | 2.76 | 2.70 | 2.66 | 2.65- | 2.63 | 2.62 | 2.61 | 2.60 | 2.59 | 2.59 | 2.58
20 | 2.92 | 2.74 | 2.69 | 2.66 | 2.64 | 2.63 | 2.62 | 2.61 | 2.60 | 2.59 | 2.59 | 2.58















































Note. The numbers in brackets in the column headed m = 60 indicate the number of units which must be subtracted in the second decimal place to obtain the level for m = 120 and the same value of n. Where no figure is given u(120, n) = u(60, n) to second decimal place accuracy. E.g. for the 2% level u(120, 5) = 2.33.

























Table 7. 0.2% points of u = u(m, n)
























































Table 8. 0.1% points of u = u(m, n)


















































[Tabulated values illegible in the source scan.]








Note. The numbers in brackets in the column headed m = 30 indicate the number of units which must be subtracted in the first decimal place to obtain the level for m = 60 and the same value of n. Where no figure is given u(60, n) = u(30, n) to first decimal place accuracy; u(120, n) = u(60, n) for (i) all 0.2% points except that u(120, 3) = 3.1 and (ii) all 0.1% points except that u(120, 2) = 3.4 and u(120, 3) = 3.3.













[Tabulated values illegible in the source scan.]





Table 9. Table for testing the significance of the deviation of the mean of a small sample (of size n) from some pre-assigned value

n \ α | 0.10 | 0.05 | 0.02 | 0.01 | 0.002 | 0.001
2 | 3.157 | 6.353 | 15.910 | 31.828 | 159.16 | 318.31
3 | 0.885- | 1.304 | 2.111 | 3.008 | 6.77 | 9.58
4 | 0.529 | 0.717 | 1.023 | 1.316 | 2.29 | 2.85+
5 | 0.388 | 0.507 | 0.685+ | 0.843 | 1.32 | 1.58
6 | 0.312 | 0.399 | 0.523 | 0.628 | 0.92 | 1.07
7 | 0.263 | 0.333 | 0.429 | 0.507 | 0.71 | 0.82
8 | 0.230 | 0.288 | 0.366 | 0.429 | 0.59 | 0.67
9 | 0.205- | 0.255+ | 0.322 | 0.374 | 0.50 | 0.57
10 | 0.186 | 0.230 | 0.288 | 0.333 | 0.44 | 0.50
11 | 0.170 | 0.210 | 0.262 | 0.302 | 0.40 | 0.44
12 | 0.158 | 0.194 | 0.241 | 0.277 | 0.36 | 0.40
13 | 0.147 | 0.181 | 0.224 | 0.256 | 0.33 | 0.37
14 | 0.138 | 0.170 | 0.209 | 0.239 | 0.31 | 0.34
15 | 0.131 | 0.160 | 0.197 | 0.224 | 0.29 | 0.32
16 | 0.124 | 0.151 | 0.186 | 0.212 | 0.27 | 0.30
17 | 0.118 | 0.144 | 0.177 | 0.201 | 0.26 | 0.28
18 | 0.113 | 0.137 | 0.168 | 0.191 | 0.24 | 0.26
19 | 0.108 | 0.131 | 0.161 | 0.182 | 0.23 | 0.25+
20 | 0.104 | 0.126 | 0.154 | 0.175- | 0.22 | 0.24

The table gives values of the ratio (deviation of mean from the pre-assigned value)/w, where w is the range in the sample, lying on different levels of significance, the levels being the sum, α, of the two tails of the probability distribution.
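A minimal sketch of how Table 9 might be applied in practice is given below. The function names, the symbol `mu0` for the pre-assigned value, and the example data are assumptions made for illustration; only the 5% column for n = 2 to 10 is copied from the table, with the rounding marks dropped.

```python
# 5% (two-tailed) points of |mean - mu0| / range, copied from Table 9.
TABLE9_5PCT = {
    2: 6.353, 3: 1.304, 4: 0.717, 5: 0.507, 6: 0.399,
    7: 0.333, 8: 0.288, 9: 0.255, 10: 0.230,
}


def range_ratio(sample, mu0):
    """Ratio of the deviation of the sample mean from mu0 to the sample range."""
    mean = sum(sample) / len(sample)
    w = max(sample) - min(sample)
    return abs(mean - mu0) / w


def significant_at_5pct(sample, mu0):
    """True when the ratio exceeds the tabulated 5% point for this sample size."""
    return range_ratio(sample, mu0) > TABLE9_5PCT[len(sample)]


if __name__ == "__main__":
    data = [10.2, 9.8, 10.5, 10.1, 9.9]   # hypothetical observations
    print(range_ratio(data, 9.5), significant_at_5pct(data, 9.5))
```

For a one-sided question the N.B. printed under Table 10 applies: the α at the head of the column is halved.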


Table 10. Table for testing the significance of the difference between the means of two small samples of equal size n

n \ α | 0.10 | 0.05 | 0.02 | 0.01 | 0.002 | 0.001
2 | 2.322 | 3.427 | 5.553 | 7.916 | 17.81 | 25.23
3 | 0.974 | 1.272 | 1.715- | 2.093 | 3.27 | 4.18
4 | 0.644 | 0.813 | 1.047 | 1.237 | 1.74 | 1.99
5 | 0.493 | 0.613 | 0.772 | 0.896 | 1.21 | 1.35+
6 | 0.405+ | 0.499 | 0.621 | 0.714 | 0.94 | 1.03
7 | 0.347 | 0.426 | 0.525+ | 0.600 | 0.77 | 0.85-
8 | 0.306 | 0.373 | 0.459 | 0.521 | 0.67 | 0.73
9 | 0.275- | 0.334 | 0.409 | 0.464 | 0.59 | 0.64
10 | 0.250 | 0.304 | 0.371 | 0.419 | 0.53 | 0.58
11 | 0.233 | 0.280 | 0.340 | 0.384 | 0.48 | 0.52
12 | 0.214 | 0.260 | 0.315+ | 0.355+ | 0.44 | 0.48
13 | 0.201 | 0.243 | 0.294 | 0.331 | 0.41 | 0.45-
14 | 0.189 | 0.228 | 0.276 | 0.311 | 0.39 | 0.42
15 | 0.179 | 0.216 | 0.261 | 0.293 | 0.36 | 0.39
16 | 0.170 | 0.205- | 0.247 | 0.278 | 0.34 | 0.37
17 | 0.162 | 0.195+ | 0.236 | 0.264 | 0.33 | 0.35+
18 | 0.155+ | 0.187 | 0.225+ | 0.252 | 0.31 | 0.34
19 | 0.149 | 0.179 | 0.216 | 0.242 | 0.30 | 0.32
20 | 0.143 | 0.172 | 0.207 | 0.232 | 0.29 | 0.31

The table gives values of the ratio $|\bar{x}_1 - \bar{x}_2| \big/ \tfrac12(w' + w'')$ = (difference between means)/(mean of sample ranges) lying on different levels of significance. The levels are the sum, α, of the two tails of the probability distribution.

N.B. When considering deviations in the positive (or negative) direction only, the values of α at the headings of the columns should be halved.


























f 
































Table 11

n | d_n | 1/d_n | √n | d_n·√n
2 | 1.1284 | 0.8862 | 1.4142 | 1.5958
3 | 1.6926 | 0.5908 | 1.7321 | 2.9316
4 | 2.0588 | 0.4857 | 2.0000 | 4.1175
5 | 2.3259 | 0.4299 | 2.2361 | 5.2009
6 | 2.5344 | 0.3946 | 2.4495 | 6.2080
7 | 2.7044 | 0.3698 | 2.6458 | 7.1551
8 | 2.8472 | 0.3512 | 2.8284 | 8.0531
9 | 2.9700 | 0.3367 | 3.0000 | 8.9101
10 | 3.0775 | 0.3249 | 3.1623 | 9.7319
11 | 3.1729 | 0.3152 | 3.3166 | 10.5232
12 | 3.2585 | 0.3069 | 3.4641 | 11.2876
13 | 3.3360 | 0.2998 | 3.6056 | 12.0281
14 | 3.4068 | 0.2935 | 3.7417 | 12.7469
15 | 3.4718 | 0.2880 | 3.8730 | 13.4463
16 | 3.5320 | 0.2831 | 4.0000 | 14.1279
17 | 3.5879 | 0.2787 | 4.1231 | 14.7932
18 | 3.6401 | 0.2747 | 4.2426 | 15.4435
19 | 3.6890 | 0.2711 | 4.3589 | 16.0798
20 | 3.7350 | 0.2677 | 4.4721 | 16.7032
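The d_n column agrees with the mean range of a normal sample of n in units of the population standard deviation; that reading of the column is an assumption here, since its definition lies outside this part of the text. Under that assumption the values can be checked with the standard integral $d_n = \int_{-\infty}^{\infty}\{1 - \Phi(x)^n - [1-\Phi(x)]^n\}\,dx$. The sketch below uses a plain midpoint rule; the function names, integration limits and step count are illustrative choices.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mean_range(n, lo=-10.0, hi=10.0, steps=20000):
    """Expected range of n standard normal values, by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        p = normal_cdf(lo + (i + 0.5) * h)
        total += 1.0 - p ** n - (1.0 - p) ** n
    return total * h


if __name__ == "__main__":
    for n in (2, 5, 10):
        d = mean_range(n)
        print(n, round(d, 4), round(1.0 / d, 4), round(d * math.sqrt(n), 4))
```

For n = 2 the integral has the closed form 2/√π ≈ 1.1284, which matches the first row of the table.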
REFERENCES

Daly, J. F. (1946). Ann. Math. Statist. 17, 71.
Davies, O. L. & Pearson, E. S. (1934). J. Roy. Statist. Soc. Suppl. 1, 76.
Fisher, R. A. (1925). Metron, 5, 90.
Hartley, H. O. (1942). Biometrika, 32, 334.
McKay, A. T. & Pearson, E. S. (1933). Biometrika, 25, 415.
Pearson, E. S. (1926). Biometrika, 18, 173.
Pearson, E. S. (1932). Biometrika, 24, 404.
Pearson, E. S. (1935). The Application of Statistical Methods to Industrial Standardization and Quality Control. B.S. No. 600. London: British Standards Institution.
Pearson, E. S. & Haines, Joan (1935). J. Roy. Statist. Soc. Suppl. 2, 83.
Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 301.
'Student' (1908). Biometrika, 6, 1.
Tippett, L. H. C. (1925). Biometrika, 17, 364.








THE FREQUENCY DISTRIBUTION OF $\sqrt{b_1}$ FOR SAMPLES OF ALL
SIZES DRAWN AT RANDOM FROM A NORMAL POPULATION


By R. C. GEARY 


1. INTRODUCTORY 


A research on which the writer has been engaged for some years has so far yielded the 
following results: 

(1) Testing for normality has a greater practical importance than statisticians (including 
the writer) have been disposed to accord to it; actual probabilities may be seriously at 
variance with probabilities derived from the well-known tables computed on the hypothesis 
of universal normality; in consequence, testing for normality and, where necessary, 
correction (even if rough and tentative) for suspected universal non-normality, should 
become a part of statistical routine. 

(2) For large samples, $\sqrt{b_1}$ and $b_2$ are the most efficient of large fields of tests of skewness and kurtosis, respectively, amongst large fields of alternative universes.

These matters will be dealt with in detail in subsequent papers. It seems, in the first instance, desirable to derive the frequency distribution of $\sqrt{b_1}$ for normal random samples of all sizes, partly on account of the inherent importance of the problem, partly in order to explore a computational technique which might be found effective in solving the analogous but probably more difficult $b_2$ problem.

Towards the solution of the problem there are available the exact values of the first four even moments (the odd moments are, of course, zero) of normal $\sqrt{b_1}$, the second, fourth and sixth having been determined by R. A. Fisher (1930) and the eighth by Joseph Pepper (1932). It may be useful here to set out the four moments. Taking


$$\sqrt{b_1} = m_3/m_2^{3/2} = \sqrt{n}\,\sum_{i=1}^{n}(x_i-\bar{x})^3 \Big/ \Big\{\sum_{i=1}^{n}(x_i-\bar{x})^2\Big\}^{3/2}, \tag{1-1}$$

where n is the sample number, we have

$$\begin{aligned}
\mu_2 &= \frac{6(n-2)}{(n+1)(n+3)},\\
\mu_4 &= \frac{108(n-2)(n^2+27n-70)}{(n+1)(n+3)(n+5)(n+7)(n+9)},\\
\mu_6 &= \frac{3240(n-2)(n^4+84n^3+2695n^2-15168n+20020)}{(n+1)(n+3)(n+5)(n+7)(n+9)(n+11)(n+13)(n+15)},\\
\mu_8 &= \frac{7\cdot5\cdot3^5\cdot2^4\,(n-2)(n^6+171n^5+13893n^4+580401n^3-5131014n^2+14132268n-12932920)}{(n+1)(n+3)(n+5)\dots(n+17)(n+19)(n+21)}.
\end{aligned} \tag{1-2}$$
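The four moment formulae lend themselves to a quick exact-arithmetic check. The sketch below is illustrative only (the function names are assumptions); it evaluates the formulae as rational numbers. For n = 3, where $\sqrt{b_1}$ has the simple distribution (3-7), they reduce to 1/4, 3/32, 5/128 and 35/2048.

```python
from fractions import Fraction


def odd_product(n, last):
    """(n+1)(n+3)...(n+last) for odd 'last'."""
    product = 1
    for k in range(1, last + 1, 2):
        product *= n + k
    return product


def sqrt_b1_moments(n):
    """Exact second, fourth, sixth and eighth moments of sqrt(b1), per (1-2)."""
    mu2 = Fraction(6 * (n - 2), odd_product(n, 3))
    mu4 = Fraction(108 * (n - 2) * (n**2 + 27 * n - 70), odd_product(n, 9))
    mu6 = Fraction(
        3240 * (n - 2) * (n**4 + 84 * n**3 + 2695 * n**2 - 15168 * n + 20020),
        odd_product(n, 15),
    )
    mu8 = Fraction(
        7 * 5 * 3**5 * 2**4 * (n - 2)
        * (n**6 + 171 * n**5 + 13893 * n**4 + 580401 * n**3
           - 5131014 * n**2 + 14132268 * n - 12932920),
        odd_product(n, 21),
    )
    return mu2, mu4, mu6, mu8


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, [round(float(m), 6) for m in sqrt_b1_moments(n)])
```

Exact fractions are used so that the large integer coefficients of the eighth moment are not subject to rounding.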














E. S. Pearson (1931, 1936) derived empirically 0.05 and 0.01 probability points for certain values of n > 25 using a Pearson Type VII curve and earlier approximations by R. A. Fisher (1929) of the second and fourth moments.

The method here used for the derivation of the frequency distribution of $\sqrt{b_1}$ is essentially an elaboration of that which the author used (1935, 1936) for finding the frequency distribution of the test of kurtosis a (the ratio of the mean deviation to the standard deviation of the numbers sampled), which consisted in establishing a relation in integral form between the frequency ordinate for n with the value for (n−1) and thereby determining the ordinates to any required degree of accuracy for the lower n's. At a certain stage the actual frequency is shown to be very close to the value based on the Gram-Charlier curve for the same value of n; and the assumption is made that the Gram-Charlier may be relied on for values of n greater than the 'transition value'. In the present problem the known normal moments are utilized as well at every stage. In the concluding section the status of the solution in the hierarchy of 'precision' is discussed.

Since the frequency is symmetrical, attention is confined practically exclusively to the positive sector.


2. THE GENERAL INTEGRAL ITERATION 


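To distinguish the sample size by the notation let the value of $\sqrt{b_1}$ be indicated by $t_n$. Apply a Helmert orthogonal transformation to the original observations $x_1, x_2, \dots, x_n$ so that

$$\begin{aligned}
x'_1 &= (x_1 - x_2)/\sqrt{2},\\
x'_2 &= (x_1 + x_2 - 2x_3)/\sqrt{6},\\
&\;\;\vdots\\
x'_{n-1} &= (x_1 + x_2 + \dots + x_{n-1} - \overline{n-1}\,x_n)/\sqrt{n(n-1)},\\
x'_n &= (x_1 + x_2 + \dots + x_n)/\sqrt{n} = \bar{x}\sqrt{n},
\end{aligned} \tag{2-1}$$

which, on inversion, gives

$$\begin{aligned}
x_1 - \bar{x} &= \frac{x'_1}{\sqrt2} + \frac{x'_2}{\sqrt6} + \dots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_2 - \bar{x} &= -\frac{x'_1}{\sqrt2} + \frac{x'_2}{\sqrt6} + \dots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_3 - \bar{x} &= -\frac{2x'_2}{\sqrt6} + \frac{x'_3}{\sqrt{12}} + \dots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
&\;\;\vdots\\
x_{n-1} - \bar{x} &= -\frac{(n-2)\,x'_{n-2}}{\sqrt{(n-1)(n-2)}} + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_n - \bar{x} &= -\frac{(n-1)\,x'_{n-1}}{\sqrt{n(n-1)}}.
\end{aligned} \tag{2-2}$$

Then
$$\sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n-1} x_i'^2, \tag{2-3}$$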








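and
$$\begin{aligned}
\sum (x_i-\bar{x})^3 ={}& 3x_1'^2\Big(\frac{x'_2}{\sqrt6} + \frac{x'_3}{\sqrt{12}} + \dots + \frac{x'_{n-1}}{\sqrt{(n-1)n}}\Big)
+ 3x_2'^2\Big(\frac{x'_3}{\sqrt{12}} + \dots + \frac{x'_{n-1}}{\sqrt{(n-1)n}}\Big) + \dots\\
&+ 3x_{n-2}'^2\,\frac{x'_{n-1}}{\sqrt{(n-1)n}}
- \Big\{\frac{x_2'^3}{\sqrt6} + \frac{2x_3'^3}{\sqrt{12}} + \dots + \frac{(n-2)\,x_{n-1}'^3}{\sqrt{(n-1)n}}\Big\}.
\end{aligned} \tag{2-4}$$

Apply a polar transformation to the $x'$, that is,

$$\begin{aligned}
x'_1 &= r\sin\phi_{n-3}\sin\phi_{n-4}\dots\sin\phi_1\sin\phi_0,\\
x'_2 &= r\sin\phi_{n-3}\sin\phi_{n-4}\dots\sin\phi_1\cos\phi_0,\\
x'_3 &= r\sin\phi_{n-3}\sin\phi_{n-4}\dots\sin\phi_2\cos\phi_1,\\
x'_4 &= r\sin\phi_{n-3}\sin\phi_{n-4}\dots\sin\phi_3\cos\phi_2,\\
&\;\;\vdots\\
x'_{n-3} &= r\sin\phi_{n-3}\sin\phi_{n-4}\cos\phi_{n-5},\\
x'_{n-2} &= r\sin\phi_{n-3}\cos\phi_{n-4},\\
x'_{n-1} &= r\cos\phi_{n-3},
\end{aligned} \tag{2-5}$$

and
$$\sum x_i'^2 = \sum (x_i-\bar{x})^2 = r^2, \tag{2-6}$$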


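whence

$$\begin{aligned}
\frac{t_n}{\sqrt{n}} ={}& \frac{3}{\sqrt6}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\dots\sin^3\phi_1\sin^2\phi_0\cos\phi_0
+ \frac{3}{\sqrt{12}}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\dots\sin^3\phi_2\sin^2\phi_1\cos\phi_1\\
&+ \dots + \frac{3}{\sqrt{(n-2)(n-1)}}\sin^3\phi_{n-3}\sin^2\phi_{n-4}\cos\phi_{n-4}
+ \frac{3}{\sqrt{(n-1)n}}\sin^2\phi_{n-3}\cos\phi_{n-3}\\
&- \frac{1}{\sqrt6}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\dots\sin^3\phi_1\cos^3\phi_0
- \frac{2}{\sqrt{12}}\sin^3\phi_{n-3}\dots\sin^3\phi_2\cos^3\phi_1\\
&- \dots - \frac{n-3}{\sqrt{(n-2)(n-1)}}\sin^3\phi_{n-3}\cos^3\phi_{n-4}
- \frac{n-2}{\sqrt{(n-1)n}}\cos^3\phi_{n-3},
\end{aligned} \tag{2-7}$$

whence the fundamental iteration

$$\frac{t_n}{\sqrt{n}} = \frac{t_{n-1}}{\sqrt{n-1}}\sin^3\phi_{n-3} + \frac{3}{\sqrt{(n-1)n}}\sin^2\phi_{n-3}\cos\phi_{n-3} - \frac{n-2}{\sqrt{(n-1)n}}\cos^3\phi_{n-3}, \tag{2-8}$$

in which there intervenes only the angle $\phi_{n-3}$; and for normal random samples it is a well-known fact that the $\phi_i$ are distributed independently of one another, the distribution of $\phi_{n-3}$ being of the form

$$C\sin^{n-3}\phi_{n-3}\,d\phi_{n-3}. \tag{2-9}$$

Now $t_{n-1}$ involves only $\phi_0, \dots, \phi_{n-4}$; hence it is independent of $\phi_{n-3}$. Accordingly, if the frequency distribution of $t_{n-1}$ is of the form

$$f_{n-1}(t_{n-1})\,dt_{n-1}, \tag{2-10}$$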








the joint distribution of $\phi_{n-3}$ and $t_{n-1}$ is given by

$$C\sin^{n-3}\phi_{n-3}\,d\phi_{n-3} \times f_{n-1}(t_{n-1})\,dt_{n-1}. \tag{2-11}$$

Now, from (2-8),
$$dt_{n-1} = dt_n\Big(\frac{n-1}{n}\Big)^{1/2}\sin^{-3}\phi_{n-3}. \tag{2-12}$$

On substituting in (2-11) and integrating we find for frequency of $t_n$ the expression

$$f_n(t_n) = \Big(\frac{n-1}{n}\Big)^{1/2}\frac{\{\tfrac12(n-3)\}!}{\sqrt{\pi}\,\{\tfrac12(n-4)\}!}\int d\phi_{n-3}\,\sin^{n-6}\phi_{n-3}\,f_{n-1}(t_{n-1}), \tag{2-13}$$

where the relation (2-8) obtains. The integral extends to values of $\phi_{n-3}$ (so that $0 < \phi_{n-3} < \pi$ for n > 3) which yield non-zero values of $f_{n-1}$. Setting $\cos\phi_{n-3} = x$ the integral at (2-13) assumes the form

$$f_n(t_n) = \Big(\frac{n-1}{n}\Big)^{1/2}\frac{\{\tfrac12(n-3)\}!}{\sqrt{\pi}\,\{\tfrac12(n-4)\}!}\int dx\,(1-x^2)^{\frac12(n-7)}f_{n-1}(t_{n-1}), \tag{2-14}$$

with, from (2-8),
$$n^{1/2}\,t_{n-1} = [(n-1)^{1/2}t_n - 3x + (n+1)x^3]/(1-x^2)^{3/2}. \tag{2-15}$$

In the derivation of the frequencies for n = 4 to 8 inclusive, dealt with in later sections, both the forms (2-13) and (2-14) are used.
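The fundamental iteration (2-8) can be checked numerically on any sample: compute $\sqrt{b_1}$ directly from n observations, and again from the value for the first (n−1) observations together with $\cos\phi_{n-3} = x'_{n-1}/r$. The sketch below is illustrative; the function names and the random test data are assumptions.

```python
import math
import random


def sqrt_b1(xs):
    """sqrt(b1) = sqrt(n) * sum(d^3) / (sum(d^2))^(3/2), as in (1-1)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs)
    s3 = sum((x - mean) ** 3 for x in xs)
    return math.sqrt(n) * s3 / s2 ** 1.5


def sqrt_b1_by_iteration(xs):
    """t_n from t_(n-1) of the first n-1 values and the last polar angle, per (2-8)."""
    n = len(xs)
    t_prev = sqrt_b1(xs[:-1])
    mean = sum(xs) / n
    r = math.sqrt(sum((x - mean) ** 2 for x in xs))
    # Last Helmert variable x'_(n-1) from (2-1); its ratio to r is cos(phi_(n-3)).
    x_last = (sum(xs[:-1]) - (n - 1) * xs[-1]) / math.sqrt(n * (n - 1))
    c = x_last / r
    s = math.sqrt(max(0.0, 1.0 - c * c))  # sin(phi) >= 0 since 0 < phi < pi
    k = math.sqrt(n * (n - 1))
    return math.sqrt(n) * (
        t_prev * s ** 3 / math.sqrt(n - 1) + 3 * s * s * c / k - (n - 2) * c ** 3 / k
    )


def max_discrepancy(n=6, trials=200, seed=34):
    """Largest absolute difference between the two computations over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        worst = max(worst, abs(sqrt_b1(xs) - sqrt_b1_by_iteration(xs)))
    return worst


if __name__ == "__main__":
    print(max_discrepancy())
```

The two computations should agree to rounding error, since (2-8) is an algebraic identity and does not depend on normality.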


3. FUNCTIONAL DISCONTINUITIES OF THE FREQUENCY 
In the integral at (2-14) $t_n$ appears merely as a parameter. Consequently the nature of the frequency $f_n(t_n)$ depends to a considerable extent on the simple algebraic properties of $t_{n-1}(x)$ given by (2-15). The following property (easily demonstrated) is fundamental:

For
$$t_n = (n-2k)/[k(n-k)]^{1/2} = {}_k\tau_n \quad (k = 1, 2, \dots), \tag{3-1}$$
$t_{n-1}(x)$ has a maximum value of
$$(n-2k+1)/[(k-1)(n-k)]^{1/2} = {}_{k-1}\tau_{n-1} \tag{3-2}$$
for
$$x = -[(n-k)/k(n-1)]^{1/2} = {}_k\xi'_n, \tag{3-3}$$
and a minimum value of
$$(n-2k-1)/[k(n-k-1)]^{1/2} = {}_k\tau_{n-1} \tag{3-4}$$
for
$$x = [k/(n-1)(n-k)]^{1/2} = {}_k\xi''_n. \tag{3-5}$$

DEFINITION. ${}_k\tau_n$ are termed the link values or links of $t_n$. The regions between consecutive links are termed zones. The graph of $t_{n-1}(x)$ for $-1 < x < +1$ and $t_n = {}_k\tau_n$ (given at (3-1)) is illustrated in Fig. 1. The limits of integration for integral (2-14) are now seen to be ${}_k\lambda'_n$ and ${}_k\lambda''_n$, which are the values of x at which the ordinates of the curve (2-15) in $(x, t_{n-1})$, with parameter $t_n = {}_k\tau_n$, assume the limiting values $+{}_1\tau_{n-1}$ and $-{}_1\tau_{n-1}$. The scale on the right shows the links of $t_{n-1}$. The curve $t_n = {}_k\tau_n$ traverses all the zones but has a 'turn' in the (k−1)th zone, remaining entirely in the zone the while. It is due to this turn that the phenomenon of functional discontinuity manifests itself in the frequency $f_n(t_n)$.

Assume that within the kth zone the frequency $f_{n-1}(t_{n-1})$ is represented by ${}_kf_{n-1}(t_{n-1})$, different in functional form for different values of k but the same (for example, having the same coefficients in a power series) within each zone. It will at once be evident, from (2-14), that the frequency of $t_n$ will have a like property. Now, from (2-4) and (2-5) it will be seen that

$$t_3 = (3\sin^2\phi_0\cos\phi_0 - \cos^3\phi_0)/\sqrt2 = -\cos 3\phi_0/\sqrt2; \tag{3-6}$$

the distribution of $\phi_0$ is rectangular, so that the distribution* of $t_3$ is given by

$$f_3(t_3)\,dt_3 = \frac{dt_3}{\pi(\tfrac12 - t_3^2)^{1/2}} \tag{3-7}$$

and zero for $|t_3| > 1/\sqrt2$. It follows that $t_3$ has a functional discontinuity at its links $\pm 1/\sqrt2$. Hence, by iteration, the frequency of $t_n$ is represented by different functional expressions in its different interlink zones.


Fig. 1. Graph of $t_{n-1}(x)$. [Figure not reproduced: curve (2-15) drawn for three values of $t_n$, one of them $t_n = {}_k\tau_n$, with the links of $t_{n-1}$ marked on the right-hand scale.]


That the frequency has a finite limit ${}_1\tau_n$ (when n is finite) is established as follows. It can easily be seen from (2-15) that when $t_n = {}_1\tau_n$ the curve $t_{n-1} = t_{n-1}(x)$ degenerates into (i) the straight line x = −1 and (ii) a section above the straight line $t_{n-1} = {}_1\tau_{n-1}$ but touching it. For $t_n > {}_1\tau_n$ no part of the curve $t_{n-1} = t_{n-1}(x)$ falls within the rectangle $x = \pm 1$, $t_{n-1} = \pm{}_1\tau_{n-1}$. Reference to (2-14) shows at once that, if $f_{n-1}(t_{n-1}) = 0$ for $|t_{n-1}| > {}_1\tau_{n-1}$, then $f_n(t_n) = 0$ for $|t_n| > {}_1\tau_n$. But (3-7) shows that $t_3$ has as limiting values $\pm 1/\sqrt2$. Hence, by iteration, it follows that the limiting values of the frequency of $t_n$ (or simpliciter of $t_n$) are

$$\pm{}_1\tau_n = \pm(n-2)/(n-1)^{1/2}. \tag{3-8}$$
* R. A. Fisher (1930). 



















As will presently appear, the frequencies for n = 4 and 5 have marked irregularities: 
successive integration in accordance with (2-14) imparts, of course, a progressively increasing 
degree of smoothness to the frequency. To give mathematical expression to this feature, 
recourse is had to the idea of order of contact. 

DEFINITION. Two functions are said to have contact of order ${}_k\gamma_n$ at link ${}_k\tau_n$ if the functions and their first $({}_k\gamma_n - 1)$ derivatives are finite and equal at the link. It can be shown without difficulty that

$${}_k\gamma_n = {}_{k-1}\gamma_{n-1} + 1, \tag{3-9}$$

when k > 1, n > 4. For what follows it will be convenient to set out for the smaller sample numbers the values of the links and their orders of contact. The links for positive values only of the variables are shown. The orders of contact ${}_1\gamma_n$ will appear from a proposition proved in §5, giving the actual values of the frequencies near the limit of range. The non-diminishing smoothness in the direction of the centre of the range will be noted.


Values of ${}_k\tau_n$ and ${}_k\gamma_n$ for n = 3 to 8 inclusive

n | 1st link: τ | γ | 2nd link: τ | γ | 3rd link: τ | γ | 4th link: τ | γ
3 | 1/√2 | 0 | - | - | - | - | - | -
4 | 2/√3 | 0 | 0 | 0 | - | - | - | -
5 | 3/2 | 1 | 1/√6 | 1 | - | - | - | -
6 | 4/√5 | 1 | 1/√2 | 2 | 0 | 2 | - | -
7 | 5/√6 | 2 | 3/√10 | 2 | 1/(2√3) | 3 | - | -
8 | 6/√7 | 2 | 2/√3 | 3 | 2/√15 | 3 | 0 | 4





























For even values of n the origin is always a link. In the determination of the frequencies for n = 5 to 8, by the methods described in subsequent sections, the link ordinates and the central ordinate play a cardinal role. In fact, the method will be seen to consist essentially in finding curves which pass through the central and link ordinal points, have the required orders of contact and the required form at the limit of range and have the exact earlier momental values (see first section).
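The link values follow directly from (3-1), and the orders of contact from the recurrence (3-9) once the first-link orders are known. The sketch below is illustrative: the function names are assumptions, and the first-link orders are simply copied from the table above for n = 3 to 8 and are not derived.

```python
import math

# Orders of contact at the first link, copied from the table for n = 3..8.
FIRST_LINK_ORDER = {3: 0, 4: 0, 5: 1, 6: 1, 7: 2, 8: 2}


def links(n):
    """Non-negative link values (n - 2k) / sqrt(k (n - k)), k = 1, 2, ..., per (3-1)."""
    return [(n - 2 * k) / math.sqrt(k * (n - k)) for k in range(1, n // 2 + 1)]


def contact_order(k, n):
    """Order of contact at the kth link of t_n, using (3-9) for k > 1, n > 4."""
    if k == 1:
        return FIRST_LINK_ORDER[n]
    if n == 4:
        return 0  # second link of n = 4 (the origin), as tabulated
    return contact_order(k - 1, n - 1) + 1


if __name__ == "__main__":
    for n in range(3, 9):
        row = [(round(t, 4), contact_order(k, n)) for k, t in enumerate(links(n), start=1)]
        print(n, row)
```

Running it for n = 3 to 8 gives one (link, order) pair per table cell, which makes transcription slips in the table easy to spot.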


4. THE FREQUENCY NEAR THE CENTRE OF RANGE

It will first be shown that
$$f'_n(+0) = 0 \quad\text{for } n > 4. \tag{4-1}$$

In fact, from (2-14) and (2-15) if $t_n = u$, a small positive quantity,
$$f_n(u) = C\int_{-A-\kappa u\,=\,\lambda'}^{+A-\kappa u\,=\,\lambda''} dx\,(1-x^2)^{\frac12(n-7)}f_{n-1}[t_{n-1}(x)], \tag{4-2}$$
A, κ being positive constants. Hence

$$f'_n(u) = -C\kappa\{(1-\lambda''^2)^{\frac12(n-7)}f_{n-1}[t_{n-1}(\lambda'')] - (1-\lambda'^2)^{\frac12(n-7)}f_{n-1}[t_{n-1}(\lambda')]\}
+ C\int_{\lambda'}^{\lambda''} dx\,(1-x^2)^{\frac12(n-7)}\frac{\partial}{\partial u}f_{n-1}[t_{n-1}(x)].$$

Letting $u \to 0$ the integral-free expression obviously vanishes provided that $f_{n-1}[t_{n-1}(A)]$ is finite, which it is when n > 4; and the integral becomes

$$\Big(\frac{n-1}{n}\Big)^{1/2}\int_{-A}^{+A} dx\,(1-x^2)^{\frac12(n-10)}f'_{n-1}\Big[\frac{(n+1)x^3 - 3x}{n^{1/2}(1-x^2)^{3/2}}\Big].$$

Since $f_{n-1}(y)$ is an even function of y, its derivative is odd, and it remains an odd function when y is replaced by an odd function of x. Hence the integral vanishes.


5. THE FORM OF THE FREQUENCY AT THE LIMIT OF RANGE
In this section it will be shown that near $t_n = +(n-2)/(n-1)^{1/2}$ the frequency is given by





_ 1 n—-3)! (n—1)- my fe i 
Fnltn) — 3 Ja }(n—4)! (3n.n—2)Kn—® ( cf th . (5 1) 

It may be seen at once that for n = 3 the se by (5-1) would be 
fa(ts) = - *( — )- (5-2) 


as at (3-7). For n = 4, (5-1) gives $\tfrac16\sqrt3$, which is the value found by A. T. McKay (1933).
The general theorem will be proved by iteration. We assume a general form 


n—3° H(n—5) 
fn—al 1 ty 1) = C,_ (5=5-44) ’ (5-3) 


and show that a similar form emerges for f,,(é,,), finding incidentally an iteration relation for 
the constant C,,. First set 


v =n—2-n—1'*t,, 
and assume that v is a positive quantity. It will readily appear, from (2-15), that, for v = 0, 
t,-3(z) has a double root at x = 1/(n—1). Accordingly we set 





x=2'+1/(n-1) (5-4) 
(n—3)? _ 4s ‘ 
and X= (n—2) —#_,. (5-5) 
Having regard only to principal terms we find 
ial n(n — 2) 
1~ set ie (5-6) 
X =2n-*(n — 1)3 (n— 2)-2 (n — 3) (1-272) v!, (5-7) 
2n—2\3 
. , ge ee dap” i 
with x — viz”. (5-8) 
Now, from (2-14), 
falta) = (“So ) HER8 MOF, lb 4) 
nvn ™m n—1\"n—1/> 





and, from the analysis in § 3, it will be clear that there are two separate parts of the domain D:

(I) a part near x = 1/(n−1) for which $t_{n-1}$ is entirely in the first zone and by hypothesis has the form (5-3);

(II) a part near x = −1 in which $t_{n-1}$ assumes all values.

* The symbol ≐ signifies 'equals, to required approximation'.








Let
$$f_n(t_n) = f_n^{\mathrm{I}}(t_n) + f_n^{\mathrm{II}}(t_n), \tag{5-9}$$
where the functions on the right represent the contributions accruing from the respective parts of the domain of integration. Then


a, | — ]\# f2"=+1 
Fultn) = ate (=) ‘ sr be at — 28)Kn—OC,_,{2n-*(n — 198 


x (n — 2)-* (n— 3) (1 —2") vt}er—9), 


which, on a change of variable from x to 2” by (5-4) and (5-8) and integrating in x”, becomes 








3 i(n—4) 
=C 4 b(n — 3)! $(n—5)! —n—2) (9 — 12-3 (9 — 2){n—® (9, — 3) Kn) (M2 
Ch13 3(n—4)!? n (n 1) (n 2) (n 3) n—-1 n > 
(5-10) 
which is of the required form. As regards the contribution of IT, in (2-15) set 
+1 = po+yrt. (5-11) 


It will be found that when ?,,_, has its limiting value +(n—3)/(m— 2)! the vanishing of the 
terms in v? and v® gives 
2 -/2 (n—8) 
ve ab 2 dae pate 
f= jn and y ace 


whereas if ¢,,_, has its limiting value — (n—3)/(n — 2)! the values are 


2 /2 (n-3 





Between the limits of x, 





v 2 /2 n-3 
a ale pp) a AN . 
it < + Jaana , (5-12) 
t,,(a) given by (2-15), assumes once all values between its limits of range and, in fact, 
2 /2t,, ; 
” is 3 nt assis 
a ES 
Now (=) ar 
ty ) da(1 —a?)K"—D fF _o(t,,_1); 5-14 
fit.) Talay ( yeo— f,,-1(b a1) (5-14) 


the limits of integration in x being given by (5-12). By (5-11) and (5-13) change the variable 
x into t,_, (via y) when (5-14) becomes 


fit.) = C(n) ole fF, altn) dt,,_1, (5: 15) 


and the integral on the right is unity. Written in full (5-15) then becomes 





th 


Fitltn) = 


1 }(n—3)! (n—1)- ees (6-16) 


#(n—4) 
3 /an43(n—4)! (3nn—2)kr—® “a ) 


There is no difficulty now in proving by iteration from (5-10) and (5-16) that the constant has the form indicated in (5-1). Note that, in (5-9), $f^{\mathrm{I}}$ accounts for (n−1)/n and $f^{\mathrm{II}}$ for 1/n of the total frequency.








6. SAMPLES OF 4
From (2-14), (2-15) and (3-7) the frequency for n = 4 is found to be

$$f_4(t_4) = \frac{\sqrt3}{2\pi}\int_D dx\,y^{-1/2}, \tag{6-1}$$

where
$$y = 2(1-x^2)^3 - (v - 3x + 5x^3)^2 \quad\text{with}\quad v = \sqrt3\,t_4. \tag{6-2}$$

D is the range of values of x which give non-negative values for y with |x| < 1. Now

$$y = -(3x^2-1)^2(3x^2-2) - 2v(5x^3-3x) - v^2, \tag{6-3}$$

from which it appears that when v is small y has two real roots near $-1/\sqrt3$, two imaginary roots near* $+1/\sqrt3$, and single roots near $+\sqrt2/\sqrt3$ and $-\sqrt2/\sqrt3$, accounting thus for all six roots. Neglecting terms of higher order in v, the four real roots are

$$-\frac{1}{\sqrt3} \pm a v^{1/2}, \quad\text{with } a^2 = \frac{2}{9\sqrt3}, \qquad\text{and}\qquad \pm\sqrt{\tfrac23} - \frac{v}{9}.$$

Hence the integral at (6-1) may be written as the sum of five integrals

$$\int_{-\sqrt{2/3}-v/9}^{-\sqrt{2/3}+v/9} + \int_{-\sqrt{2/3}+v/9}^{-1/\sqrt3-av^{1/2}} + \int_{-1/\sqrt3+av^{1/2}}^{1/\sqrt3-av^{1/2}} + \int_{1/\sqrt3-av^{1/2}}^{1/\sqrt3+av^{1/2}} + \int_{1/\sqrt3+av^{1/2}}^{\sqrt{2/3}-v/9}.$$

Fig. 2 illustrates the division of the region of integration D. There are five divisions, numbered I-V, in what follows, in which we regard as 'principal terms' only those in log v and the constant term. Terms in $v^{1/2}$ will ultimately be ignored:


vee 
I= [- O(vt), 
vi- » 


-1)y3—aet 
=| was [act —1+305)-3 (2-324) 


—Vi+0/9 
6 vi—v/9 
=7 A (¥ v)! — 3a%y — log sate) = | =V 


1/V3+avi 
+1/V3—avt ] 3a2v 
= x= ae 2\—1 a Pe itty wens esas hiniheapiieatil 
Il | geet i) dx(1 — 3a) (2— 322) 4x — 5 “glee (isa): 
1/V3+avt +1 
v= | a124{ (2'84.1)-4de’ = 1 sinh-11. 
1/V3—avt = 3 
Neglecting O(v') we accordingly have 
1+114+11+IV+V=-— log 3a%v + — 1 inh-11 420. (6-4) 


B B 


The constant 2C derives from additional terms in integrals II, III and V: 
z 
+ ara 
Ili = { da{(1 — 3a*)? (2 — 3x*) — 2v(5a — 32) — v?}-4 
a 
with y = 1/J/3—avt. 


* Tn a sense which will be obvious from Fig. 2. 







We have already taken account of the term in III found when »v is zero. The constant C 


derives from the even powers of z in the formal expansion of the denominator of the integral 
element—the gdd powers vanish by symmetry. Setting 


1 
x= ya” 1—32°=2,/32', 2--329=1, 5a°-3e=-4$,/3. 
On expansion of IIT, 


« 1/v3 
C2=2> dar’ xe’—4#—-1 (2y)%* (2 ,/3)-**-1 C,,, 
k=1J avt : 
where C,,, are the even-order coefficients in the expansion of (1 + z)-+, i.e. C,, is the coefficient 
of z**. On integration we are interested only in the value at the lower limit av, for all 
terms at the upper limit (and certain terms at the lower limit) are O(vt) at least. Hence 
1 


=e is - 
C 7B Cull 


Fig. 2. Graph of y (see (6-2)). [Figure not reproduced.]

Note. This diagram is designed merely to give a general idea of the limits of integration. It is not drawn to any scale. [The values of the ξ listed with the figure are illegible in the source scan.]

Similarly II + V also yield C giving a constant additional term of 2C. 


12g 1 


Now C= 32, 4k ~ 3 


1 
[ Fiar+oat+... 

9 & 

1 ff! dx 1 f'de "2 ve 
oe i ee ari 2)-44 (122) 
al. rtp) ettte eee 
(14+22)!-1 


LT ge tng EPH og eT 
-+| - ogz+t tlog yaya t? eiy (1-2) ac 
1 


= vg{log 2+ } log (24—1)}. (6-5) 





All logs are to base e (unless otherwise indicated in what follows). Hence

$$f_4(t_4) = 0.372646 - \frac{1}{2\pi}\log v \tag{6-6}$$
$$\phantom{f_4(t_4)} = 0.285222 - 0.366466\log_{10}|t_4|, \tag{6-7}$$
since $v = \sqrt3\,t_4$.

A. T. McKay (1933), from a different approach, gave the log term in (6-7), and, as a rough approximation to the constant term, the value 0.311568. He also showed that an expression of the form (6-7) accounted for most of the frequency, a fact of great importance. Assume that the residual term is of form
$$A|t_4|^{1/2} + B|t_4|,$$
and find A and B from

(i) $f_4(2/\sqrt3) = \tfrac16\sqrt3$ (from (5-1); also McKay (1933)),
(ii) total frequency is unity,

giving
$$f_4(t_4) = 0.285222 - 0.366466\log_{10}|t_4| - 0.009178|t_4|^{1/2} + 0.031359|t_4|. \tag{6-8}$$
For algebraic manipulation at the next stage the form of residual $A'|t_4| + B't_4^2$ will be found more convenient, however, with A' and B' also determined from (i) and (ii). In this form
$$f_4(t_4) = 0.285222 - 0.159155\log|t_4| + 0.014275|t_4| + 0.007398\,t_4^2. \tag{6-9}$$
Note the smallness of the coefficients A, B, A' and B' in (6-8) and (6-9).

In the following table the first four even moments as derived from frequencies (6-8) and (6-9) are compared with the actual values as derived from the formulae (1-2). Both formulae yield excellent approximations, with (6-8) always superior to (6-9) however. Either formula can obviously be used with complete confidence for deriving the probability points. The frequency graph in Fig. 4 is derived from (6-8) which should also be used for the computation of the probability points.























Moment | Actual | Formula (6-8) | Formula (6-9)
μ2 | 0.342857 | 0.342930 | 0.342470
μ4 | 0.258941 | 0.258979 | 0.258606
μ6 | 0.240503 | 0.240263 | 0.240205
μ8 | 0.245940 | 0.245949 | 0.246662
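The figures in the (6-9) column can be reproduced by direct numerical integration of the approximate frequency over its range $|t_4| \le 2/\sqrt3$. The sketch below is illustrative only: the function names, the midpoint rule and the step count are assumptions, and the coefficients are those printed in (6-9).

```python
import math

LIMIT = 2.0 / math.sqrt(3.0)  # limit of range of t_4, from (3-8)


def f4(t):
    """Approximate frequency of t_4 in the form (6-9); natural logarithm."""
    a = abs(t)
    return 0.285222 - 0.159155 * math.log(a) + 0.014275 * a + 0.007398 * a * a


def even_moment(power, steps=200000):
    """Moment of even order over the symmetric range, by the midpoint rule."""
    h = LIMIT / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoints avoid the logarithmic singularity at t = 0
        total += t ** power * f4(t)
    return 2.0 * h * total


if __name__ == "__main__":
    print("total frequency:", even_moment(0))
    print("second moment:  ", even_moment(2))
```

The total frequency comes out at unity and the second moment at about 0.34247, in agreement with condition (ii) and with the tabulated entry for (6-9).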








7. SAMPLES OF 5 


After many computational experiments the method used for determining the frequency $f_5(t_5)$ was as follows:

(1) Using (2-14) with form (6-9) for $f_4(t_4)$, central and link ordinates, i.e. $f_5(0)$ and $f_5(1/\sqrt6)$ were computed.


(2) The approximate value of f,(t;) near t; = 1/,/6+0 was found in the form 
Js(1//6) + M(t; — 1/,/6)*, 


M being known. 








Ww 





1 





R. C. GEARY 79 


(3) The two zonal curves were found (i) passing through (0, f,(0)) and (1/,/6, f,(1/./6)) 
with f;(0) = 0 and (ii) passing through (1/,/6, f,(1/,/6)) and with the required form at 1/,/6+0 
(i.e: as at (2) above) and at the limit (3 — 0) so that 7) (= 1), #, 4, and yw, have the exact values 
as given for n = & by the formulae (1-2). 

Setting then
$$f_4(t_4) = 0.285222 - 0.159155\,\log_e|t_4| + R(t_4), \tag{7.1}$$
with
$$t_4 = \frac{6\,(x^3 - \tfrac12 x + \tfrac13 t_5)}{\sqrt{5}\,(1 - x^2)^{3/2}} \tag{7.2}$$
and
$$R(t_4) = 0.014275\,|t_4| + 0.007398\,t_4^2, \tag{7.3}$$
we have
$$f_5(t_5) = \frac{4}{\pi\sqrt{5}}\int_\lambda^\mu \frac{dx}{1 - x^2}\,f_4(t_4), \tag{7.4}$$
the limits of integration being $\lambda$ (negative) and $\mu$ (positive), which are the values of $x$, from
(7.2), corresponding respectively to $t_4 = -\tfrac23\sqrt{3}$ and $t_4 = +\tfrac23\sqrt{3}$. We shall be concerned only
with the case $t_5 < 1/\sqrt{6}$, when $t_4(x)$ has three real roots $\beta$, $\alpha$ and $\gamma$ of which $\beta$ is negative and $\alpha$
and $\gamma$ are positive. For (7.4) the following are required:


$$\int_\lambda^\mu \frac{dx}{1 - x^2} = \tfrac12\log\frac{(1+\mu)(1-\lambda)}{(1-\mu)(1+\lambda)}, \tag{7.5}$$
2 
> — = Hlog? (1 + 1) — log? (1 + A) — log? (1 — x) + log? (1 — A)} 


+ {log (1 + 2) log (1 — yu) — log (1 + A) log (1 —A)} 


Sr oa tid i(*3"). (7-6) 


— Hog (5 my log (14a) (1+ A) (+r +5{ ¥ Mert ¥ Ila, (77) 
+ fh 2\i% j=7 j 
with K, = (2—A)/(1—«), kK; = (B—A)/(1-A), Ky = (y—A)/(1+y), 
Kp =("#—a)/(l+a), Ke =(H—-A)/(1+A), Kyo = (HY) /(1—Y), 
Kg =(y—A)/(I-y), Ky = (@—A)/(1 +2), = (B- A)(L+ A), 
Ky=("—-y)/(L+y), Ke =(H-a@)[(1-@), Kye = (H—A)/(1—A), 
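and
$$I(\kappa) = \int_0^\kappa \frac{\log x}{1 + x}\,dx = \log\kappa\,\log(1+\kappa) - \psi(\kappa),$$
$$J(\kappa) = \int_0^\kappa \frac{\log x}{1 - x}\,dx = -\log\kappa\,\log(1-\kappa) - \phi(\kappa),$$
when $\kappa < 1$.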








It is useful to note that $\phi(1) = 1.644934 = 2\psi(1)$. The functions $\phi(\kappa)$ and $\psi(\kappa)$ do not
appear to be tabulated. By fitting curves to their values for equally spaced intervals of 0.05
from 0 to 0.5 the following very close approximations are found, applicable for $\kappa < \tfrac12$:
$$\phi(\kappa) = 1.000567\,\kappa + 0.233454\,\kappa^2 + 0.186052\,\kappa^3,$$
$$\psi(\kappa) = 0.999835\,\kappa - 0.244220\,\kappa^2 + 0.077024\,\kappa^3.$$
When $1 > \kappa > \tfrac12$ the following formulae can be used:
$$\phi(\kappa) = \phi(1) - \log\kappa\,\log(1-\kappa) - \phi(1-\kappa),$$
$$\psi(\kappa) = \tfrac12\phi(1) + \log\kappa\,\log(1+\kappa) - \phi(1-\kappa) + \tfrac12\phi(1-\kappa^2).$$
When $\kappa > 1$ we use
$$\phi(\kappa) = 2\phi(1) - \tfrac12\log^2(1/\kappa) - \phi(1/\kappa),$$
$$\psi(\kappa) = 2\psi(1) + \tfrac12\log^2(1/\kappa) - \psi(1/\kappa).$$
Another useful formula is
$$\phi(\kappa) = \psi(\kappa) + \tfrac12\phi(\kappa^2).$$
The algebra of the contribution to (7.4) from $R(t_4)$ is without mathematical interest.

From the formula the following were the values found for the central frequency and the
second link frequency:
$$f_5(0) = 0.606563; \qquad f_5(1/\sqrt{6}) = 0.599069. \tag{7.8}$$
The moments, computed by (2-1), with $n = 5$, are
$$\mu_0 = 1,\quad \mu_2 = 0.375,\quad \mu_4 = 0.361607,\quad \mu_6 = 0.474609,\quad \mu_8 = 0.719382. \tag{7.9}$$
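As a rough cross-check on the central ordinate in (7.8), the integral (7.4) can be evaluated numerically at $t_5 = 0$. The sketch below rests on assumptions that should be read as such: it takes (7.2) to be $t_4 = 6(x^3 - \tfrac12 x + \tfrac13 t_5)/\{\sqrt5\,(1-x^2)^{3/2}\}$, takes (7.4) to be $f_5(t_5) = \frac{4}{\pi\sqrt5}\int_\lambda^\mu f_4(t_4)\,dx/(1-x^2)$, uses form (6.9) for $f_4$, and replaces the paper's analytical treatment by a plain midpoint rule (the logarithmic singularities of $f_4$ at $t_4 = 0$ are integrable, so the rule still converges). All function names are hypothetical.

```python
import math

SQRT5 = math.sqrt(5.0)
T4_LIMIT = 2.0 / math.sqrt(3.0)  # limit of range of t_4

def f4(t4):
    """Frequency of t_4 in the form (6.9); not defined at t4 == 0."""
    a = abs(t4)
    return 0.285222 - 0.159155 * math.log(a) + 0.014275 * a + 0.007398 * a * a

def t4_of_x(x, t5=0.0):
    """Assumed reading of (7.2)."""
    return 6.0 * (x ** 3 - 0.5 * x + t5 / 3.0) / (SQRT5 * (1.0 - x * x) ** 1.5)

def upper_limit():
    """Bisection for mu, the x beyond 1/sqrt(2) at which t_4 reaches its limit (t_5 = 0)."""
    lo, hi = 1.0 / math.sqrt(2.0), 0.999
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if t4_of_x(mid) < T4_LIMIT:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def f5_central(n_steps=200_000):
    """Midpoint-rule estimate of f_5(0) from the assumed form of (7.4)."""
    mu = upper_limit()
    h = mu / n_steps
    total = 0.0
    for i in range(n_steps):
        x = (i + 0.5) * h
        t4 = t4_of_x(x)
        if t4 == 0.0:        # skip an exact hit on the log singularity
            continue
        total += f4(t4) / (1.0 - x * x)
    # integrand is even in x when t_5 = 0, hence the factor 2
    return 2.0 * 4.0 / (math.pi * SQRT5) * total * h

if __name__ == "__main__":
    print(f5_central())
```

If the assumed readings are right, the printed value should lie close to the central frequency quoted at (7.8).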


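Computation by approximate integration of certain of the ordinates gave evidence of
marked irregularity near the link $t_5 = 1/\sqrt{6}$. In consequence, it seemed desirable to try to
find a term (in addition to the constant given at (7.8)) of the expansion of $f_5(t_5)$ near $1/\sqrt{6} + 0$.
Setting
$$t_5 = \frac{1}{\sqrt{6}} + t, \tag{7.10}$$
where $t$ is small and positive (we shall be interested only in a term in $t^{1/2}$), we find
$$f_5(t_5) = \frac{4}{\pi\sqrt{5}}\left\{\int_{\lambda+\lambda' t}^{\nu-\Delta t^{1/2}} + \int_{\nu+\Delta t^{1/2}}^{\mu+\mu' t}\right\}\frac{dx}{1 - x^2}\,f_4(t_4). \tag{7.11}$$
The values $\nu \pm \Delta t^{1/2}$ are the abscissae of the points at which the curve $t_4 = t_4(x)$, given by (7.2),
intersects the $t_4$ link line $t_4 = 2/\sqrt{3}$ near $x = \nu = -\tfrac14\sqrt{6}$. It can easily be shown that
$$\Delta^2 = \frac{\sqrt{6}}{12}. \tag{7.12}$$
We are not concerned with the values $(\lambda + \lambda' t)$ and $(\mu + \mu' t)$, which are the abscissae
corresponding to the intersection of $t_4 = t_4(x)$ with $t_4 = -2/\sqrt{3}$ and its third intersection with
$t_4 = 2/\sqrt{3}$. Remembering that at the latter link $f_4(t_4)$ has the value $1/(2\sqrt{3})$, the
integral-free term in $t^{-1/2}$ in the first derivative $f_5'(t_5)$ of $f_5(t_5)$ is
$$\frac{4}{\pi\sqrt{5}}\,\frac{1}{1-\nu^2}\left\{f_4(\nu - \Delta t^{1/2}) \times \left(-\tfrac12\Delta t^{-1/2}\right) + f_4(\nu + \Delta t^{1/2}) \times \left(-\tfrac12\Delta t^{-1/2}\right)\right\} = -\frac{16\,\Delta}{5\pi\sqrt{15}}\,t^{-1/2}. \tag{7.13}$$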


Also we have to consider the integral term in f;(t;). For this purpose, from (7-10) and (7-2), 


= (e+ VD (2 - 4) +3)0-2 
=F(2+V84 a)\(2 -%- 5) +9} ay. 










Remembering (7.1), it can be shown that the only term in (7.11) from which a term in $t^{1/2}$
can come is approximately


4 1/" dx Es ei v6 
~ Bn T= 8\(*- Je) 94: ma 
$\lambda$ and $\mu$, the limiting values, being respectively negative and positive.
Differentiating (7.14) with respect to $t$ we find





2 Be d. -] 6 )-1(,/6 2 
RB ee ge ge a mee 
mJ5J,1—a*\\ 6 9 to} \9 9\" Jé 9} 
Changing variables b oe oes | . tt 
ging 8 by x ats = hy 
and letting ¢ tend towards +0, we find for the term in t+ 
2 (76t2 
AS —. f-3 . 
arn = ). (7-16) 


Adding (7-13) and (7-16) we find 


: , : 1 \} 
On integrating we find for the term in $t^{1/2} = (t_5 - 1/\sqrt{6})^{1/2}$
$$-0.594117\,\left(t_5 - \frac{1}{\sqrt{6}}\right)^{1/2}. \tag{7.17}$$
From (5.1) the value of $f_5(x)$ near $x = \tfrac32$ is
$$0.219166\,(\tfrac32 - x)^{1/2}, \tag{7.18}$$
where $x$ is usually written for simplicity instead of $t_5$ in the remainder of this section.
Having regard to (7.8) and to the fact that, from § 4, $f_5'(0) = 0$, in the half-zone (0 to $1/\sqrt{6}$),
$f_5(x) = F(x)$ must be of the form
$$F(x) = 0.606563 + a_2x^2 + a_3x^3 + a_4x^4. \tag{7.19}$$
The first relation between the coefficients is found by giving expression to the fact that
$y = F(x)$ passes through the link-point $(1/\sqrt{6},\ 0.599069)$:
$$0.166667\,a_2 + 0.068041\,a_3 + 0.027778\,a_4 = -0.007494. \tag{7.20}$$
In the zone ($1/\sqrt{6}$ to $\tfrac32$) assume that
$$f_5(x) = G(x) = -0.594117\left(x - \frac{1}{\sqrt{6}}\right)^{1/2} + 0.219166\,(\tfrac32 - x)^{1/2} + b_0 + b_1(\tfrac32 - x) + b_2(\tfrac32 - x)^2 + b_3(\tfrac32 - x)^3, \tag{7.21}$$
designed to conform with requirements (7.17) and (7.18). Since $y = G(x)$ must pass through
$(\tfrac32, 0)$,
$$b_0 = 0.594117\left(\tfrac32 - \frac{1}{\sqrt{6}}\right)^{1/2} = 0.620775. \tag{7.22}$$


Taking the value of $b_0$ into account and giving algebraic expression to $y = G(x)$ passing
through $(1/\sqrt{6},\ 0.599069)$, we find
$$1.091752\,b_1 + 1.191922\,b_2 + 1.301283\,b_3 = -0.250706. \tag{7.23}$$

To find the six coefficients $a_2$, $a_3$, $a_4$ (in (7.19)) and $b_1$, $b_2$, $b_3$ (in (7.21)), we have, so far, found
two equations, (7.20) and (7.23). The remaining four equations are found by equating the
total frequency to unity and the first three even moments to their true values given at (7.9),
i.e. setting
$$\tfrac12\mu_{2k} = \int_0^{1/\sqrt{6}} dx\,x^{2k}F(x) + \int_{1/\sqrt{6}}^{3/2} dx\,x^{2k}G(x) \qquad (k = 0, 1, 2, 3).$$
On substituting for $a_4$ given by (7.20), for $b_0$ given by (7.22) and for $b_3$ given by (7.23), we
find the four equations in $a_2$, $a_3$, $b_1$, $b_2$:
0-079072 1a, + 0-07139999a, + 0-297981b, + 0-108441b, = — 0-071173,) 
0-064802a, + 0-03110237a, + 0-268332b, + 0-082590b, = — 0-066293, | 
0°045901la, +0-04107175a, + 0-303248b, + 0-079086b, = — 0-076019, 
0-0°6368a, +0-0°1169a, + 0-400090b, + 0-089674b, = — 0-101300. 


On solution (and checking by substitution) the coefficients are found to give finally the 
following frequencies: 


(7-24) 





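  Zone                              $f_5(x)$
  0 to $1/\sqrt{6}$:                $0.606563 - 0.3307\,x^2 + 3.1955\,x^3 - 6.1129\,x^4$,
  $1/\sqrt{6}$ to $\tfrac32$:       $0.620775 - 0.594117\,(x - 1/\sqrt{6})^{1/2} + 0.219166\,(\tfrac32 - x)^{1/2} - 0.268273\,(\tfrac32 - x)$
                                    $+\ 0.067263\,(\tfrac32 - x)^2 - 0.029195\,(\tfrac32 - x)^3$,          (7.25)
with $x = t_5$.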
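The two zonal curves at (7.25) can be checked for consistency with the conditions used to derive them: both should take the link value 0.599069 at $x = 1/\sqrt{6}$, and the total frequency over the whole symmetric range should be close to unity. The sketch below is illustrative only; it assumes the half-integer powers in the second zone are $\tfrac12$, and it uses composite Simpson integration rather than anything described in the paper.

```python
import math

ALPHA = 1.0 / math.sqrt(6.0)   # link
LIMIT = 1.5                    # limit of range for n = 5

def F(x):
    """First half-zone of (7.25)."""
    return 0.606563 - 0.3307 * x**2 + 3.1955 * x**3 - 6.1129 * x**4

def G(x):
    """Second zone of (7.25); powers 1/2 are an assumed reading."""
    u = max(x - ALPHA, 0.0)    # guard against tiny negative round-off
    v = max(LIMIT - x, 0.0)
    return (0.620775 - 0.594117 * math.sqrt(u) + 0.219166 * math.sqrt(v)
            - 0.268273 * v + 0.067263 * v**2 - 0.029195 * v**3)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def total_frequency():
    return 2.0 * (simpson(F, 0.0, ALPHA) + simpson(G, ALPHA, LIMIT))

if __name__ == "__main__":
    print(F(ALPHA), G(ALPHA), total_frequency())
    print(F(0.0894), F(0.3027))   # near the minimum and maximum of the first half-zone
```

The square-root end behaviour makes Simpson's rule converge more slowly than for a polynomial, which is why a fairly fine subdivision is used.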

The extremely interesting form of the frequency curve may be observed from Fig. 5. In
the first half-zone the frequency shows but little variation: the curve declines to a minimum
of 0.6058 at $x = 0.0894$, then rises to a maximum of 0.6136 at $x = 0.3027$. It then recedes to
the link $1/\sqrt{6}$, where it assumes the value 0.5991. As one type of check on the reliability of
the results in general, some ordinates were computed directly (i.e. using (7.4) and (7.1)),
or by approximate integration using (7.4) and (6.8), and compared with the ordinates
computed from (7.25) to the following effect:

                    Value of frequency
  Trial value       By approx.
  of $t_5$          integration       By (7.25)
  0.15              0.6069            0.6068
  0.3               0.6106            0.6136
  0.6               0.3650            0.3603
  0.9               0.2232            0.2308
  1.2               0.1377            0.1371

















Except perhaps for the frequency at $t_5 = 0.9$, the correspondence is satisfactory; there can
be little doubt that the more accurate figures are those from (7.25).

As a stringent test of the accuracy of the frequency the 8th moment $\mu_8'$ was computed
from the empirical curves at (7.25) and compared with the actual value given at (7.9):
$$\mu_8' = 0.7191, \qquad \mu_8 = 0.7194.$$
Even as the figures stand the check is decisive: it should be added that the 4th place of
decimals in $\mu_8'$ is suspect owing to the approximation used.


8. SAMPLES OF 6 
In this case the links are 0, $1/\sqrt{2}$ and $4/\sqrt{5}$, and the link frequencies at the first two were
found by approximate integration using form (2.13) with $t_5$ given by (2.8). For this purpose,
drawings were made of the two sections of $f_5(t_5)$ on a scale sufficient to ensure that an
ordinate read for any abscissa would be correct probably to the 3rd place of decimals. For
intervals of 1°, values of $t_5$ were computed over the whole range by (2.8) (for $t_6$ given), and
graphically the value of $f_5(t_5)$ was read off for each $t_5$. Hundreds of readings had to be made,
but actually the work, with a little practice, was rapid and accurate, the entries being
practically self-checking. The Gregory formula (using 2 correction terms) was used to give the
following results:
$$f_6(0) = 0.6889; \qquad f_6(1/\sqrt{2}) = 0.3247. \tag{8.1}$$
The two zonal frequency curves, say $y = F(x)$ in (0 to $1/\sqrt{2}$) and $y = G(x)$ in ($1/\sqrt{2}$ to $4/\sqrt{5}$),
writing $x$ instead of $t_6$, must have the following properties:

(i) $F(0) = 0.6889$,
(ii) $F'(0) = 0$ (§ 4),
(iii) $F(1/\sqrt{2}) = G(1/\sqrt{2}) = 0.3247$,                                    (8.2)
(iv) $F'(1/\sqrt{2}) = G'(1/\sqrt{2})$ (§ 3),
(v) $G(x) = 5(4/\sqrt{5} - x)/36$ (from (5.1)).

The curves were
$$F(x) = 0.6889 + a_2x^2 + a_3x^3 + a_4x^4 + a_5x^5,$$
$$G(x) = \tfrac{5}{36}(\beta - x) + b_2(\beta - x)^2 + b_3(\beta - x)^3 + b_4(\beta - x)^4 \quad\text{with } \beta = 4/\sqrt{5}. \tag{8.3}$$
The exact moments are
$$\mu_0 = 1,\quad \mu_2 = 0.380952,\quad \mu_4 = 0.409191,\quad \mu_6 = 0.642924,\quad \mu_8 = 1.219892. \tag{8.4}$$
It is proposed to compute the seven coefficients in (8.3) using (8.2) and (8.4). Now, with
a curve of the type of $f_6(x)$, where much of the frequency is at the ends, it is evident that the
contribution from the zone (0 to $1/\sqrt{2}$) to the higher moments $\mu_6$ and $\mu_8$ is exceedingly minute:
this property was utilized to divide the single series of seven equations into two series of
three and four equations using the following device. Approximate $F(x)$ by a curve $F_1(x)$
given by
$$F_1(x) = 0.6889 + a_2'x^2, \tag{8.5}$$
finding $a_2'$ simply by passing $y = F_1(x)$ through $(1/\sqrt{2},\ 0.3247)$, giving $a_2' = -0.7284$. If $\mu_{2s}$ is
the moment, let $\mu_{2s}'$ and $\mu_{2s}''$ be the contributions from $F(x)$ and $G(x)$ respectively, so that
$\mu_{2s} = \mu_{2s}' + \mu_{2s}''$. Let $\nu_{2s}'$ be the estimate, using $F_1(x)$, of $\mu_{2s}'$. For $s = 2, 3, 4$ the values are
$$\nu_4' = 0.030318,\qquad \nu_6' = 0.009360,\qquad \nu_8' = 0.003397,$$
which, subtracted from the corresponding $\mu_{2s}$ given by (8.4), give very close estimates of
$\mu_{2s}''$, which involve only $b_2$, $b_3$, $b_4$. The equations in order

(1) $1.170178\,b_2 + 1.265837\,b_3 + 1.369316\,b_4 = 0.174500$,
(2) $0.726980\,b_2 + 0.378949\,b_3 + 0.235502\,b_4 = 0.058771$,                  (8.6)
(3) $1.260067\,b_2 + 0.534702\,b_3 + 0.289663\,b_4 = 0.091217$,

are found from (iii) at (8.2), from $\mu_6$ and from $\mu_8$.
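The three equations at (8.6) form a small linear system that can be solved directly. The following is a minimal sketch, using ordinary Gaussian elimination with partial pivoting (a stand-in for whatever desk method was actually used); the function and variable names are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

# Coefficients of b2, b3, b4 and right-hand sides, copied from (8.6).
A_86 = [[1.170178, 1.265837, 1.369316],
        [0.726980, 0.378949, 0.235502],
        [1.260067, 0.534702, 0.289663]]
RHS_86 = [0.174500, 0.058771, 0.091217]

b2, b3, b4 = solve_linear(A_86, RHS_86)

if __name__ == "__main__":
    print(b2, b3, b4)
```

The system is only moderately well conditioned, so the last printed decimal of each solved coefficient depends on the rounding of the six-figure entries.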





The equations in the $a$'s are

(4) $0.5\,a_2 + 0.353553\,a_3 + 0.25\,a_4 + 0.176777\,a_5 = -0.3642$,
(5) $1.414214\,a_2 + 1.5\,a_3 + 1.414214\,a_4 + 1.25\,a_5 = -0.653179$,
(6) $0.117851\,a_2 + 0.0625\,a_3 + 0.035355\,a_4 + 0.020833\,a_5 = -0.115790$,            (8.7)
(7) $0.035355\,a_2 + 0.020833\,a_3 + 0.012627\,a_4 + 0.007813\,a_5 = -0.031433$,

where (4) is from (iii) at (8.2), (5) from (iv) at (8.2), (6) from the total frequency $= \mu_0$ and
(7) from variance $= \mu_2$.
The solutions of (8.6) and (8.7) yield the following frequencies:

  Zone                                 $f_6(x)$
  0 to $1/\sqrt{2}$:                   $0.6889 - 1.2715\,x^2 - 2.6073\,x^3 + 9.5669\,x^4 - 6.7790\,x^5$,
  $1/\sqrt{2}$ to $4/\sqrt{5}$:        $\tfrac{5}{36}(\beta - x) + 0.047068\,(\beta - x)^2 + 0.024897\,(\beta - x)^3 + 0.064198\,(\beta - x)^4$,     (8.8)
                                       with $\beta = 4/\sqrt{5}$,
with $x = t_6$.
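The momental check on (8.8) can be reproduced by integrating the two zonal polynomials. The sketch below is illustrative: it assumes the leading coefficient of the second zone is $5/36$ (as at (8.2)(v)) and uses composite Simpson integration, which is effectively exact for polynomials of this degree; the names are not from the paper.

```python
import math

LINK = 1.0 / math.sqrt(2.0)
BETA = 4.0 / math.sqrt(5.0)      # limit of range for n = 6

def F(x):
    """First half-zone of (8.8)."""
    return 0.6889 - 1.2715 * x**2 - 2.6073 * x**3 + 9.5669 * x**4 - 6.7790 * x**5

def G(x):
    """Second zone of (8.8), in powers of (BETA - x)."""
    u = BETA - x
    return (5.0 / 36.0) * u + 0.047068 * u**2 + 0.024897 * u**3 + 0.064198 * u**4

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def even_moment(k):
    """k-th moment of the symmetric frequency built from F and G."""
    first = simpson(lambda x: x**k * F(x), 0.0, LINK)
    second = simpson(lambda x: x**k * G(x), LINK, BETA)
    return 2.0 * (first + second)

if __name__ == "__main__":
    mu4 = even_moment(4)
    print(even_moment(0), mu4, 100.0 * (mu4 - 0.409191) / 0.409191)
```

The last printed figure is the percentage error of the fourth moment relative to the exact value at (8.4).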

As a check, the 4th moment computed from the foregoing curves gave 0.4108 as compared
with the actual $\mu_4 = 0.4092$, an error of 0.38 %. This is not of any importance from the
viewpoint of the computation of the probability points, but it illustrates how, using the integral
iteration method as generally in this paper, the momental check reveals increasing
discrepancies with increasing $n$.

It might be thought that by constructing empirically 'almost any' symmetrical frequency
curve, so that say the 0th, 2nd and 4th moments have the true values, we shall ensure that
the subsequent even moments computed from such an empirical curve will approximate
closely to the corresponding true values. That this is not the case may be seen by computing
the 6th and 8th moments by the well-known Karl Pearson iteration formula,* where $\beta_1$
and $\beta_2$ have their true values, for $\sqrt{b_1}$ with $n = 6$:

                                                Karl Pearson     Percentage
                                  Actual        iteration        discrepancy
  $\beta_4 = \mu_6/\mu_2^3$       11.6291       12.4984          + 10.7
  $\beta_6 = \mu_8/\mu_2^4$       57.9214       73.3990          + 26.7




















Even when, in the Pearson iteration for $\beta_6$, one gives $\beta_4$ the correct value, we find a
percentage discrepancy of 17.9. These percentages place in perspective the minuteness of the
percentage errors found in using the higher momental check as it is used throughout this
paper.

It is an interesting question of general import whether in work of this kind the arduous
and potentially erroneous computation (by integral iteration) of the central and link
frequencies could be dispensed with, and reliance placed entirely on the moments, together
with the functional properties of the frequencies, which, of course, merely represent an
elaboration of the Karl Pearson approach. In this connexion a couple of experiments were
made on the $\sqrt{b_1}$ frequency for $n = 6$.


* Tables for Statisticians and Biometricians, Part 1, 2nd ed., p. xi. 










For the first experiment, the two zonal curves were assumed to have the correct order
of contact, the correct form, (8.2)(v), at the limit of range and the correct values of
$\mu_0$ $(= 1)$, $\mu_2$, $\mu_4$ and $\mu_6$. The equations are

  Zone
  0 to $1/\sqrt{2}$:                $F(x) = 0.659844 - 1.075618\,x^2 + 0.555991\,x^3$,
  $1/\sqrt{2}$ to $4/\sqrt{5}$:     $G(x) = \tfrac{5}{36}(\beta - x) + 0.080560\,(\beta - x)^2 - 0.085469\,(\beta - x)^3$       (8.9)
                                    $+\ 0.133119\,(\beta - x)^4$ with $\beta = 4/\sqrt{5}$.

This gives a central frequency 0.6598 compared with the computed frequency (by (8.1)) of
0.6889. In all the circumstances the difference is not important. The 8th moment, $\mu_8'$, from
(8.9), is 1.217706, or $-0.18$ % in error.

The second experiment contemplated the frequency as a single-curve system with correct
first derivative $(-\tfrac{5}{36})$ at the limit and with correct $\mu_0$, $\mu_2$, $\mu_4$, $\mu_6$. The curve is
$$F_1(x) = 0.669426 - 1.51097\,x^2 + 1.53854\,x^3 - 0.60545\,x^4 + 0.085157\,x^5, \tag{8.10}$$
which has the properties: (i) the central ordinate 0.6694 is close to the actual; (ii) the limit value
from the curve scarcely differed from the actual since $F_1(4/\sqrt{5}) = 0.0015$; (iii) $\mu_8$ from the curve
$= 1.2237$, an error of 0.31 %.

All the systems (8.8), (8.9) or (8.10) yield probability points which differ very little. For
instance, in the three cases, the 5 % point is given by

  System      5 % probability point
  (8.8)       1.0432
  (8.9)       1.0385
  (8.10)      1.0384

The practical identity of the latter two is due to the fact that the frequencies were derived
on very similar hypotheses: it does not mean that the result is more reliable than that from
(8.8) which, assuming the accuracy of the calculation of the link ordinates, must be deemed
to be the most correct and is adopted for the iteration to the $n = 7$ stage. Nevertheless, these
experiments convey the hint of general application that if we know (i) a number of moments,
(ii) the limits of range and the frequency form at the limits of range, and (iii) that the amount
of frequency near the limits of range is not negligible, we will probably be in a position to
estimate with fair accuracy the points of low probability. For this, however, hypothesis (iii)
is essential: it has no value from the computational point of view if the frequency near the
limits is negligible. This point is discussed further in § 10.


9. SAMPLES OF 7 
The functional properties of the curves at this stage are as follows. Let the three links be
denoted by $\alpha$, $\beta$, $\gamma$, so that
$$\alpha = 1/\sqrt{12},\qquad \beta = 3/\sqrt{10},\qquad \gamma = 5/\sqrt{6}. \tag{9.1}$$
Denoting $t_7$ by $x$, set
$$y = x - \alpha,\qquad z = \gamma - x, \tag{9.2}$$
and let the curves in the half-zone (0 to $1/\sqrt{12}$), and in the zones ($1/\sqrt{12}$ to $3/\sqrt{10}$) and
($3/\sqrt{10}$ to $5/\sqrt{6}$) be denoted respectively by $F(x)$, $G(y)$ and $H(z)$. We then have

(i) $F(0) = 0.6781 = A$,
(ii) $F'(0) = 0$,
(iii) $F(\alpha) = G(0) = 0.5870 = B$,
(iv) $F'(\alpha) = G'(0)$,
(v) $F''(\alpha) = G''(0)$,                                                        (9.3)
(vi) $G(\beta - \alpha) = H(\gamma - \beta) = 0.1838 = C$,
(vii) $G'(\beta - \alpha) = -H'(\gamma - \beta)$,
(viii) $H(z) = Dz^{3/2} + c_2z^2 + c_3z^3$ with $D = 0.078091$.

The central and link ordinates $A$, $B$ and $C$ at (i), (iii) and (vi) were derived by the Gregory
formula from (2.14), using intervals of 0.01, 0.025 and 0.05 at different sections of the
integral range. The equalities in the derivatives at the links are in accordance with order of
contact requirements (§ 3). The first term on the right of (viii) is from (5.1) with $n = 7$.

Conditions (9.3) determine the form of the polynomials:
$$F(x) = A + a_2x^2 + a_4x^4,$$
$$G(y) = B + (2a_2\alpha + 4a_4\alpha^3)\,y + \tfrac12(2a_2 + 12a_4\alpha^2)\,y^2 + b_3y^3 + b_4y^4, \tag{9.4}$$
$$H(z) = Dz^{3/2} + c_2z^2 + c_3z^3,$$
with $x = t_7$.

$F(x)$ is taken as an even function of $x$ because it is symmetrical in the zone ($-1/\sqrt{12}$ to
$+1/\sqrt{12}$). This should have been done in the case of $n = 5$; neglect to do so was not serious
enough to render recalculation necessary.

The moments used were:
$$\mu_0 = 1,\quad \mu_2 = 0.375,\quad \mu_4 = 0.421875,\quad \mu_6 = 0.733487. \tag{9.5}$$
Using (9.3) in conjunction with $\mu_0$ and $\mu_2$ (only) in (9.5) the following equations in the six
unknowns $a_2$, $a_4$, $b_3$, $b_4$, $c_2$, $c_3$ were found:

         Left: coefficients of                                                       Right:
  Eqn.   $a_2$       $a_4$       $b_3$       $b_4$       $c_2$       $c_3$           absolute term
  1      12          1           -           -           -           -               -13.112496
  2      0.816667    0.281315    0.287507    0.189756    -           -               -0.403210
  3      -           -           -           -           1.193685    1.304172        0.094626
  4      1.897366    0.756233    1.306833    1.150028    2.185118    3.581055        -0.122438
  5      0.229605    0.069278    0.047439    0.025048    0.434724    0.356221        -0.122152
  6      0.130637    0.041871    0.032192    0.017835    0.668438    0.496634        -0.044365

Approximations to $F$, $G$ and $H$ were found:
$$F_1(x) = 0.6781 - 1.238888\,x^2 + 1.754160\,x^4,$$
$$G_1(y) = 0.5870 - 0.546478\,y - 0.361808\,y^2 + 0.171129\,y^3 + 0.347163\,y^4, \tag{9.6}$$
$$H_1(z) = 0.078091\,z^{3/2} - 0.017319\,z^2 + 0.088408\,z^3.$$









These yielded estimates of the 4th and 6th moments as follows:
$$\mu_4' = 0.419712,\qquad \mu_6' = 0.720776, \tag{9.7}$$
differing by $-0.5$ % and $-1.7$ % respectively from the correct values at (9.5). These
deviations were not serious from the viewpoint of probability-point determination. Nevertheless,
it seemed worth while to try to achieve a closer approximation. This was done by finding a
'corrector' $\phi(x)$ (not positive, like a frequency, for all values of $x$) with the following
properties:

(i) total 'frequency' zero,
(ii) '2nd moment' zero,
(iii) $\phi'(0) = 0$,                                                              (9.8)
(iv) $\phi(\gamma) = \phi'(\gamma) = 0$,
(v) '4th moment' $= \mu_4 - \mu_4' = 0.002163$.

Then
$$\phi(x) = 0.002404 - 0.028853\,x^2 + 0.045232\,x^3 - 0.024236\,x^4 + 0.004342\,x^5, \tag{9.9}$$
and the frequencies finally adopted are

  0 to $1/\sqrt{12}$:                 $F(x) = F_1(x) + \phi(x)$,
  $1/\sqrt{12}$ to $3/\sqrt{10}$:     $G(y) = G_1(y) + \phi(x)$,                  (9.10)
  $3/\sqrt{10}$ to $5/\sqrt{6}$:      $H(z) = H_1(z) + \phi(x)$,

$F_1$, $G_1$ and $H_1$ being given by (9.6) and $x = t_7$. It is evident from the smallness of the
coefficients of $\phi(x)$ in (9.9) that the correction effected by $\phi(x)$ is minute. From (9.10) the moment
$\mu_6$ is 0.728972, so that the error is reduced to about one-third of what it was using $F_1$, $G_1$
and $H_1$.
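The five properties at (9.8) can be checked directly against the corrector polynomial (9.9), since its moments over the symmetric range $(-\gamma, \gamma)$ with $\gamma = 5/\sqrt{6}$ are available in closed form. This is only a sketch: the coefficients are those printed at (9.9), $\phi$ is treated as a function of $|x|$, and the helper names are hypothetical.

```python
import math

GAMMA = 5.0 / math.sqrt(6.0)   # limit of range for n = 7

# phi(x) = sum of COEFFS[p] * x**p for x >= 0, coefficients from (9.9)
COEFFS = {0: 0.002404, 2: -0.028853, 3: 0.045232, 4: -0.024236, 5: 0.004342}

def phi(x):
    """Corrector (9.9), evaluated at |x|."""
    x = abs(x)
    return sum(c * x**p for p, c in COEFFS.items())

def phi_prime(x):
    """Derivative of the corrector for x >= 0."""
    return sum(c * p * x**(p - 1) for p, c in COEFFS.items() if p > 0)

def corrector_moment(k):
    """k-th 'moment' (k even) of the corrector over (-GAMMA, GAMMA)."""
    half = sum(c * GAMMA**(p + k + 1) / (p + k + 1) for p, c in COEFFS.items())
    return 2.0 * half

if __name__ == "__main__":
    print("total 'frequency':", corrector_moment(0))
    print("'2nd moment':     ", corrector_moment(2))
    print("'4th moment':     ", corrector_moment(4))
    print("phi, phi' at limit:", phi(GAMMA), phi_prime(GAMMA))
```

Because the individual terms are of order unity and nearly cancel, the results are sensitive to the six-figure rounding of the printed coefficients, so agreement should be expected only to four or five decimals.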


10. SAMPLES OF 8 
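The links and link frequencies are as follows:

  Link                        Link frequency
  0                           $0.6927 = A$,
  $\beta = 2/\sqrt{15}$       $0.4442 = B$,                                         (10.1)
  $\gamma = 2/\sqrt{3}$       $0.1018 = C$,
  $\delta = 6/\sqrt{7}$       $0.04019153\,(\delta - x)^2 = D(\delta - x)^2/2$,

where $x = t_8$.
Set
$$y = x - \beta,\qquad z = \delta - x,\qquad \kappa = \gamma - \beta = 0.638303,\qquad \Delta = \delta - \gamma = 1.113086. \tag{10.2}$$
The orders of contact (§ 3) entail the following forms for the three zones:

  Zone
  0 to $2/\sqrt{15}$:                $F(x) = A + a_2\dfrac{x^2}{2!} + a_3\dfrac{x^3}{3!} + a_4\dfrac{x^4}{4!}$,
  $2/\sqrt{15}$ to $2/\sqrt{3}$:     $G(y) = B + \left(a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!}\right)y + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)\dfrac{y^2}{2!} + b_3\dfrac{y^3}{3!} + b_4\dfrac{y^4}{4!}$,     (10.3)
  $2/\sqrt{3}$ to $6/\sqrt{7}$:      $H(z) = D\dfrac{z^2}{2!} + c_3\dfrac{z^3}{3!} + c_4\dfrac{z^4}{4!}$.

Five of the seven equations required to determine the $a$, $b$ and $c$ will be found from the
order of contact conditions, as follows:

(i) $B = A + a_2\dfrac{\beta^2}{2!} + a_3\dfrac{\beta^3}{3!} + a_4\dfrac{\beta^4}{4!}$,
(ii) $C = B + \left(a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!}\right)\kappa + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)\dfrac{\kappa^2}{2!} + b_3\dfrac{\kappa^3}{3!} + b_4\dfrac{\kappa^4}{4!}$,
(iii) $C = D\dfrac{\Delta^2}{2!} + c_3\dfrac{\Delta^3}{3!} + c_4\dfrac{\Delta^4}{4!}$,
(iv) $a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!} + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)\kappa + b_3\dfrac{\kappa^2}{2!} + b_4\dfrac{\kappa^3}{3!} = -D\Delta - c_3\dfrac{\Delta^2}{2!} - c_4\dfrac{\Delta^3}{3!}$,
(v) $a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!} + b_3\kappa + b_4\dfrac{\kappa^2}{2!} = D + c_3\Delta + c_4\dfrac{\Delta^2}{2!}$.

The remaining two equations were found by equating the 0th and 2nd moments from the
curves to the true values 1 and 4/11 respectively. The frequency functions found were as
follows:
$$F(x) = 0.6927 - 0.320142\,x^2 - 2.7751\,x^3 + 3.0824\,x^4,$$
$$G(y) = 0.4442 - 0.854177\,y + 0.308677\,y^2 + 0.649680\,y^3 - 0.553667\,y^4, \tag{10.4}$$
$$H(z) = 0.040192\,z^2 + 0.027763\,z^3 + 0.008933\,z^4,$$
where $x = t_8$.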
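The frequency functions at (10.4) lend themselves to a direct numerical check: the three zones should integrate to a total frequency of unity, and the upper tail beyond a given value of $t_8$ gives the corresponding probability. The sketch below uses composite Simpson integration zone by zone; as an example it evaluates the tail beyond $t_8 = 1.671682\,\sigma$ with $\sigma = \sqrt{4/11}$, the value tabulated against probability 0.05 at (11.7). The function names are illustrative, not the paper's.

```python
import math

BETA = 2.0 / math.sqrt(15.0)
GAMMA = 2.0 / math.sqrt(3.0)
DELTA = 6.0 / math.sqrt(7.0)   # limit of range for n = 8

def F(x):
    return 0.6927 - 0.320142 * x**2 - 2.7751 * x**3 + 3.0824 * x**4

def G(y):
    return 0.4442 - 0.854177 * y + 0.308677 * y**2 + 0.649680 * y**3 - 0.553667 * y**4

def H(z):
    return 0.040192 * z**2 + 0.027763 * z**3 + 0.008933 * z**4

# Each zone as (lower, upper, frequency as a function of x = t_8).
ZONES = [
    (0.0, BETA, F),
    (BETA, GAMMA, lambda x: G(x - BETA)),
    (GAMMA, DELTA, lambda x: H(DELTA - x)),
]

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def upper_tail(x0):
    """Probability that t_8 exceeds x0 (x0 >= 0), integrating zone by zone."""
    total = 0.0
    for lo, hi, f in ZONES:
        a = max(lo, x0)
        if a < hi:
            total += simpson(f, a, hi)
    return total

if __name__ == "__main__":
    sigma = math.sqrt(4.0 / 11.0)
    print("total frequency:", 2.0 * upper_tail(0.0))
    print("tail beyond 1.671682 sigma:", upper_tail(1.671682 * sigma))
```

Integrating each zone separately avoids placing a Simpson panel across a link, where the polynomial form changes.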

For reasons which will be apparent in the next section, it was not deemed necessary to
apply higher momental checks in this case.

Reference may here be made to yet another experiment, the negative result of which
may have some interest. At the $n = 6$ stage the remarkable 'regularity' which the curve
assumed, after its highly bizarre appearance at the stage before, suggested that orders of
contact (except at the limit of range) might be ignored at a slightly later stage and a single
curve fitted using the moments only.

Using $\mu_0$, $\mu_2$, $\mu_4$ and $\mu_6$, and the $D(\delta - x)^2/2$ (see (10.1)) for the form at the limit of range,
with $F_1'(0) = 0$, the following frequency curve was found:
$$F_1(x) = 0.040192\,(\delta - x)^2 + 0.132866\,(\delta - x)^3 - 0.293716\,(\delta - x)^4 + 0.231146\,(\delta - x)^5 - 0.039279\,(\delta - x)^6 - 0.005209\,(\delta - x)^7. \tag{10.5}$$
The correct values of the moments (to 6 places) were
$$\mu_0 = 1,\quad \mu_2 = 0.363636,\quad \mu_4 = 0.414644,\quad \mu_6 = 0.763334,\quad \mu_8 = 1.823617. \tag{10.6}$$
The value $\mu_8'$ of the 8th moment computed from the curve was 1.993270, an error therefore
of $+9.3$ %. The central ordinate $F_1(0) = 0.9017$ as compared with the actual 0.6927, so
that the curve $F_1(x)$ could not validly be used for further iteration, since the frequencies near
the central frequency would be considerably in error. The probability (computed from
(10.5)) for $F_1(x)$ beyond the 'true' 5 % probability point (computed from (10.3)) is 0.0456,
which is quite accurate enough for practical purposes. This concordance, unexpected in view
of the other facts mentioned, is due principally to the fact that $F_1(x)$ has the correct form at
the limit of range. This experiment shows that, despite the regularity of the $\sqrt{b_1}$ distribution
for $n = 8$, the problem of finding the nearly exact distribution cannot be treated in cavalier
fashion.










11. PROBABILITY POINTS FOR FREQUENCIES FOR SAMPLES OF 8 OR MORE 


By the Gram-Charlier theorem for symmetrical distributions under general conditions any
frequency $f(w)$, where $w$ has mean zero and variance unity, can be expanded in the form
$$f(w) = \left\{1 + \frac{\lambda_4}{4!\,\lambda_2^2}\left(\frac{d}{dw}\right)^4 + \frac{\lambda_6}{6!\,\lambda_2^3}\left(\frac{d}{dw}\right)^6 + \dots\right\}\alpha(w), \tag{11.1}$$
where
$$\alpha(w) = \frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac12 w^2\right),$$
the $\lambda$ being semi-invariants of the original variate. Let $u$ be a normal variate with mean
zero and variance unity. Using the method of E. A. Cornish & R. A. Fisher (1937) their
expression for $w$ in terms of $u$ has been extended to the following effect:
$$w = u - \frac{\lambda_4}{24\lambda_2^2}x_3 - \frac{\lambda_4^2}{384\lambda_2^4}y_5 - \frac{\lambda_6}{720\lambda_2^3}x_5 + \frac{\lambda_4^3}{3072\lambda_2^6}y_7 - \frac{\lambda_4\lambda_6}{1152\lambda_2^5}z_7 - \frac{\lambda_8}{40320\lambda_2^4}x_7, \tag{11.2}$$
where the $x_i$ are Hermite polynomials in $u$ of the degree indicated. The $y_i$ and $z_7$ terms in
(11.2) are as follows:
$$x_3 = -u^3 + 3u, \qquad\qquad\qquad\ \ y_7 = 9u^7 - 131u^5 + 451u^3 - 321u,$$
$$y_5 = 3u^5 - 24u^3 + 29u, \qquad\quad z_7 = u^7 - 17u^5 + 69u^3 - 57u, \tag{11.3}$$
$$x_5 = -u^5 + 10u^3 - 15u, \qquad x_7 = -u^7 + 21u^5 - 105u^3 + 105u.$$
At (11.2) the expansion is taken to $O(n^{-3})$ because $\lambda_{2k}/\lambda_2^k$ is $O(n^{-k+1})$ when the $\lambda$ are
semi-invariants of $\sqrt{b_1}$ for samples of $n$.
The $x_i$, $y_i$ and $z_7$ functions at various probability levels are as follows:



































                                       Probability points
  Function    0.10          0.05          0.025         0.01          0.001
  $u$         1.281552      1.644854      1.959964      2.326348      3.090223
  $x_3$       1.739867      0.484338      -1.649229     -5.610905     -20.239354
  $y_5$       -2.97984      -22.98240     -37.09056     -30.28992     +226.9286           (11.4)
  $x_5$       -1.632248     7.789154      16.986942     22.868797     -33.058481
  $y_7$       136.1309      194.9563      -22.4505      -675.7597     -1286.263
  $z_7$       19.09291      41.19959      27.20047      -53.45689     -261.9424
  $x_7$       -19.5234      -74.2935      -88.4883      -15.5752      362.6625

For $n = 8$ the semi-invariants, etc., required are
$$\lambda_2 = 0.363636, \qquad \lambda_4/\lambda_2^2 = 0.1357,$$
$$\lambda_4 = 0.017950, \qquad \lambda_6/\lambda_2^3 = -1.1612, \tag{11.5}$$
$$\lambda_6 = -0.055836, \qquad \lambda_8/\lambda_2^4 = 2.6577.$$
$$\lambda_8 = 0.046470,$$

If the formula at (11.2) were quite correct and then if we computed, at any probability
level $\epsilon$, the value of $w$, then set $x = \lambda_2^{1/2}w$ and from (10.4) computed the probability from end
of range, the result should be exactly $\epsilon$, assuming, of course, that (10.4) gives the exact
frequency distribution. When this procedure is carried out at the different pseudo-probability
(i.e. the probability of $u$) levels indicated, the following results are found:

  Pseudo-probability                                    0.10        0.05        0.025       0.01        0.001
  (a) True probability (to $x = \lambda_2^{1/2}w$)      0.096855    0.050459    0.026825    0.011504    0.001             (11.6)
  (b) Normal probability (to $x = \lambda_2^{1/2}u$)    0.095564    0.052376    0.029502    0.013419    0.001090


The correspondence at (a) is obviously satisfactory. At first sight it might appear that at the
0.01 and 0.001 levels the divergence is (by the standards of this communication) rather
marked. Actually this is not the case considering the fantastic difference in the algebraic form
of the Gram-Charlier and the actual frequencies near the limit of range. The probabilities
at (b) show that the normal curve gives quite a good representation. At $n = 8$, however, the
comparison flatters the normal curve since, as R. A. Fisher (1930) has shown, the ratio
$\mu_4/\mu_2^2$ actually assumes its normal value of 3 at $n = 7$ and reaches its greatest value at $n = 22$.

We now propose to take a step which is discussed in some detail in the final section. We
shall endow the right side of (11.2) with a remainder term which will make the probability
of $w$ formally the same as the pseudo-probability at (11.6). The following table shows the
value of the variate $t_8\sigma^{-1}$ computed from (10.4) (where $x$ represents $t_8$) at different true
probability levels, together with the corresponding value of $w$ computed from (11.2):











  Probability    $t_8/\sigma$    $w$         $R$
  0.10           1.253173        1.273231    -82.2
  0.05           1.671682        1.666548    21.0                                  (11.7)
  0.025          2.043181        2.008014    143.3
  0.01           2.445125        2.389594    227.5
  0.001          3.107977        3.079696    115.8

It has been seen that the difference between $w$ as given by (11.2) and the true value is $O(n^{-4})$.
Accordingly the values of $R$ were found at the different probability levels by setting
$$w + \frac{R}{n^4} = \frac{t_8}{\sigma}, \tag{11.8}$$
with $n = 8$. The estimates of the probability points $P$ for values of $n > 8$ are accordingly





$$P = \lambda_2^{1/2}\left(A + B\frac{\lambda_4}{\lambda_2^2} + C\frac{\lambda_4^2}{\lambda_2^4} + D\frac{\lambda_6}{\lambda_2^3} + E\frac{\lambda_4^3}{\lambda_2^6} + F\frac{\lambda_4\lambda_6}{\lambda_2^5} + G\frac{\lambda_8}{\lambda_2^4} + \frac{R}{n^4}\right), \tag{11.9}$$
the values of $A$, $B$, ..., $G$ and $R$ being given in the following table:

  Probability   A          B            C          D          E          F          G           R
  0.10          1.281552   -0.0724945   0.00776    0.00227    0.04431    -0.01657   0.000484    -82.2
  0.05          1.644854   -0.0201808   0.05985    -0.01082   0.06346    -0.03576   0.001843    21.0
  0.025         1.959964   0.0687179    0.09659    -0.02357   -0.00731   -0.02362   0.002195    143.3
  0.01          2.326348   0.2337877    0.07888    -0.03176   -0.21997   0.04640    0.000386    227.5
  0.001         3.090223   0.8433065    -0.59096   0.04591    -0.41871   0.22738    -0.008995   115.8

The terms in the first four columns agree with, or have been derived from, Cornish & Fisher
(1937). The $\lambda$'s are semi-invariants derivable from the exact values of the moments given
at (1.2).
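The chain from (11.2) to (11.8) can be traced numerically for $n = 8$. The sketch below is offered under explicit assumptions: the signs and divisors in (11.2) and the polynomial $y_7 = 9u^7 - 131u^5 + 451u^3 - 321u$ are readings chosen to agree with the tabulated values at (11.4) and the coefficients $B$ to $G$ at (11.9); the semi-invariants are those at (11.5); and the function names are illustrative.

```python
# Semi-invariants of sqrt(b1) for n = 8, as printed at (11.5).
LAM2, LAM4, LAM6, LAM8 = 0.363636, 0.017950, -0.055836, 0.046470

def polynomials(u):
    """Polynomials of (11.3) (assumed reading), returned as a dict."""
    return {
        "x3": -u**3 + 3*u,
        "y5": 3*u**5 - 24*u**3 + 29*u,
        "x5": -u**5 + 10*u**3 - 15*u,
        "y7": 9*u**7 - 131*u**5 + 451*u**3 - 321*u,
        "z7": u**7 - 17*u**5 + 69*u**3 - 57*u,
        "x7": -u**7 + 21*u**5 - 105*u**3 + 105*u,
    }

def w_of_u(u, l2=LAM2, l4=LAM4, l6=LAM6, l8=LAM8):
    """Extended Cornish-Fisher expression (11.2), as assumed here."""
    p = polynomials(u)
    r4, r6, r8 = l4 / l2**2, l6 / l2**3, l8 / l2**4
    return (u
            - r4 * p["x3"] / 24.0
            - r4**2 * p["y5"] / 384.0
            - r6 * p["x5"] / 720.0
            + r4**3 * p["y7"] / 3072.0
            - r4 * r6 * p["z7"] / 1152.0
            - r8 * p["x7"] / 40320.0)

def remainder(u, t_over_sigma, n=8):
    """R from (11.8): w + R / n**4 = t_8 / sigma."""
    return (t_over_sigma - w_of_u(u)) * n**4

if __name__ == "__main__":
    # (u, t_8 / sigma) pairs taken from tables (11.4) and (11.7)
    for u, t in [(1.281552, 1.253173), (1.644854, 1.671682),
                 (1.959964, 2.043181), (2.326348, 2.445125)]:
        print(u, round(w_of_u(u), 6), round(remainder(u, t), 1))
```

Since $R$ is a small difference multiplied by $8^4 = 4096$, it reproduces the tabulated values only to within a few tenths when the rounded semi-invariants are used.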








As a test, the following is a comparison of the 0.05 and 0.01 probability points for $n = 25$
as derived by E. S. Pearson (1930) (using a Type VII curve) with the values from (11.9):

  Probability                   Geary
  level          Pearson        (11.9)
  0.05           0.711          0.707
  0.01           1.061          1.062

With standard deviation $\sigma = 0.435$ it is obvious that the differences are not important.
Sample number 25 is the lowest for which Pearson computed the probability points, and
for two levels only. The formulae at (11.9) can probably be accepted with confidence.

[Fig. 3. Frequency of $\sqrt{b_1}$ for $n = 3$. Curves: $\sqrt{b_1}$ frequency ($n = 3$) and normal frequency, both for s.d. = 1; links marked. True standard deviation of $\sqrt{b_1}$ = 0.5000. Horizontal axis: variate (s.d. = 1); vertical axis: frequency.]


12. CONCLUSION
From frequency formulae (5.2), (6.8), (7.25), (8.8), (9.10) (with (9.6) and (9.9)) and (10.4)
the probability points for $\sqrt{b_1}$ for normal random samples of $n = 3, 4, 5, 6, 7$ and 8,
respectively, can be determined without difficulty. The six frequency distributions are illustrated
in Figs. 3-8. On each of the $\sqrt{b_1}$ frequency curves there is superimposed the normal frequency
with the same standard deviation, the intention being to enable a contrast to be made
between the several $\sqrt{b_1}$ curves by reference each to the normal frequency, and to show the
fairly rapid approach of the $\sqrt{b_1}$ frequency to normality with increasing $n$, even for small
samples.*

In this research nothing was so remarkable as the transformation which the single step
in the iteration, namely that from $n = 5$ to $n = 6$, effected in the shape of the frequency
curve. From $n = 6$ on, the join at the links is effected so smoothly as to be almost
imperceptible to the eye. The eye, however, flatters the actual approach to normality in the $\sqrt{b_1}$
frequency curves, as measured algebraically by the probability points.


[Fig. 4. Frequency of $\sqrt{b_1}$ for $n = 4$. Curves: $\sqrt{b_1}$ frequency ($n = 4$) and normal frequency, both for s.d. = 1; links marked. True standard deviation of $\sqrt{b_1}$ = 0.5856. Horizontal axis: variate (s.d. = 1); vertical axis: frequency.]


It may be well, at this stage, to recapitulate. Using integral iteration formula (2-13) 
(or (2. 14)), frequency ordinates were computed at values of the variate termed the ‘links’ 
at which the frequency is shown to have functional discontinuities. Using the exact values 
of the moments (given at (1-2)), and taking into account the known order of contact (§ 3) of 
the different functions at the links and the known form assumed by the frequency at the 


* As R. A. Fisher (1930) has shown, the approach to normality is not, however, uniform with in- 
creasing n, as indicated, say, by f,. See p. 90 above. 








—_— tone Oe . 2 lUmhOClo 








R. C. GEARY 93 


known limit of range, inter-link frequencies were determined in polynomial form. Attention 
is directed to the use, at-the n = 4 to 7 (inclusive) stages, of the higher moments for the 
purpose of checking the general reliability of the frequency curve (or rather series of curves 
joined at the links). 

Of far greater practical importance, however, are the formulae (11-9) designed for the 
estimation of the 0-10, 0-05, 0-025, 0-01 and 0-001 probability points for normal random 
samples of ,/b, for n > 8. There will be little trouble about finding the corresponding formulae 
for other probability links. What degree of confidence can be reposed in these formulae? 
This raises in an acute form the vexed question (on which the protagonists of different schools 
were prone to get very vexed indeed a generation ago) of how best to use moments (or semi- 
invariants) for estimating frequency distributions. The general problem was constantly 


0-4. 





03 wane far Senene rere for $.D.~ 1 


--— Normal frequency 


_o7 


7? Link 
\ True standard deviation of b,— 0-6124 


Frequency 


O-1- 








1 1 
0 05 F140 15 2-0 12-5 3-0 
Variate (s.p. = 1) 
Fig. 5. Frequency of ./b, for n = 5. 


in the writer’s mind during the present research and he would be glad if his colleagues could 
study the possibilities of the methods which culminated in formulae (11-9) for bridging 
the chasm which still divides the knowledge (sometimes exact) of the lower moments of 
statistics like √b₁ and b₂ and the formulae (however empirically established) for the frequency,
in which a measure of confidence can be reposed. This fundamental problem was abandoned 
some years ago in a thoroughly unsatisfactory condition. 

The Karl Pearson approach consists essentially in having regard to the ‘shape’ which
experience has shown that frequency curves tend to assume and to use the first four moments














94 Frequency distribution of √b₁


for the purpose of determining the constants of the curve. The disadvantage of the Pearson 
method is that of itself it gives no indication as to whether the resulting curve closely follows 
the actual frequency: it is necessary to have recourse to such devices as comparing the curve 
with a frequency distribution determined from hundreds of random sample computations 
of the statistic under examination. Apart from the tediousness of this method it is often 
indecisive in regard just to the parts of the frequency which are of most importance, namely 
the ends, because the small numbers which the check computation throws into these zones 
are usually subject to large (Poisson) errors. 









[Fig. 6. Frequency of √b₁ for n = 6. Frequency plotted against the variate (s.d. = 1), with the normal frequency for comparison and the link marked. True standard deviation of √b₁ = 0·6172.]


The Gram-Charlier system, on the other hand, can only be used with confidence when the 
frequency is fairly close to the normal. In practice the reliability is judged by the con- 
vergence of such terms as one can compute from the moments, i.e. if the successive terms 
show an ‘unmistakable’ tendency to diminish one feels confident in the computed frequency.

Obviously what both the Pearson, the Gram-Charlier and other frequency systems require 
is a Remainder Theorem. Since, however, an infinite number of moments are required to 
define a frequency distribution, with only a few moments known the most that can be 
expected is that upper (or lower) limits of the probability of the statistic can be established as 













functions of the known moments. This is what Tchebychev’s Theorem, and theorems of the
type, do. Too much cannot be expected from the knowledge of a few moments: the approximations
are almost invariably too rough for statistical use, when a high standard of efficiency
is required; and M. Fréchet (1937) has shown that the Tchebychev type approximations
are the best, given the assumptions, which can be made. For all their great mathematical
importance (incidentally for their justification for the statistician of ‘the faith that is in
him’), it seems to the writer that research on these lines will not produce formulae which
will be statistically utilizable in general conditions; but he may be quite wrong.


[Fig. 7. Frequency of √b₁ for n = 7. Frequency plotted against the variate (s.d. = 1), with the normal frequency for comparison and the link marked. True standard deviation of √b₁ = 0·6124.]


Knowing the earlier moments the Cornish-Fisher type expression (depending on the 
Gram-Charlier form of frequency) gives, at any probability level, an expansion for the 
variate to a defined order in the sample number. As might be surmised from the coefficients 
of the normal moments (e.g. (1:2) above), the coefficients in powers of n-* in the expansion 
of the variate usually tend to increase rapidly. In the present paper a remainder term of 
suitable order in n has been added to the known terms in the former expansion and its 
coefficient found by reference to the (assumed) exactly known expansion for n = 8. Clearly 
two more terms (in n-> and n-*) respectively could have been found had we iterated the 
frequency to n = 9 and n = 10, respectively, though this was not deemed necessary in the 
















present case. It would appear that in problems analogous to the √b₁ frequency, great precision
might be obtainable by this method; it is proposed to use it with b₂; and its efficacy
could be judged by iterating the frequency on a few stages further and testing the formula 
(with remainder) by reference to these iterations which were not used to establish the 
remainder.* 

The iteration method is onerous but, with the co-operative effort to which it lends itself, 
is quite practicable and seems to have a range of application which is not confined to problems 
in which universal normality is assumed. Knowledge of the moments is not necessary for 
its application: in the present research the moments constituted an embarras de choix 













[Fig. 8. Frequency of √b₁ for n = 8. Frequency plotted against the variate (s.d. = 1), with the normal frequency for comparison and the link marked. True standard deviation of √b₁ = 0·6030.]


which necessitated the solution of linear equations in six or seven unknowns. In other work 
it may suffice to compute, from the iteration frequency, many ordinates and simply to rely 
on μ₀, i.e. the total frequency being unity at each stage. This is what the writer did (1935) in
establishing the frequency of the test of normality a. 
It is clear that the frequency formulae for √b₁ given in this communication cannot be
regarded as ‘proved’ in the sense, say, that R. A. Fisher has proved the frequency of normal
t first given by W.S. Gosset. Even for n = 4 to 8, inclusive, the iteration method yields only 
approximations; the method, however, can be used to attain any desired degree of precision 
* Clearly the effect of the remainder diminishes rapidly with increasing n. 










by increasing the number of frequencies computed by approximate integration at each stage, 
though this is not to be recommended to the solitary researcher. Formulae (11·9) for the
probability points for n > 8 are more empirical and it would be useful to know how they
fare at, say, n = 12, by comparison with the ‘actual’ frequency found by extension of the
iteration, should any students be sufficiently interested in making the experiment, the
results of which incidentally would be a guide in the further use of the method of integral
iteration here exploited. The present research does not hold out much prospect of exact
solutions (in the mathematical sense) of the √b₁ and b₂ frequency problems since such
solutions would involve the exact knowledge of about 4n separate sectional frequencies
in the √b₁ case. It does show that empirical solutions can be found accurately enough for
practical purposes. 


REFERENCES 


Cornish, E. A. & Fisher, R. A. (1937). Revue de l’Institut international de Statistique,
5 Année, 4 Livraison, pp. 309-20.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. 30, 199-238.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16-28.

Fréchet, M. (1937). Généralités sur les Probabilités. Variables aléatoires.

Geary, R. C. (1935). Biometrika, 27, 310-32 and 353-55.

Geary, R. C. (1936). Biometrika, 28, 295-305.

McKay, A. T. (1933). Biometrika, 25, 204-10.

Pearson, E. S. (1930). Biometrika, 22, 239-49.

Pearson, E. S. (1931). Biometrika, 22, 423-4.

Pearson, E. S. (1936). Biometrika, 28, 306-7.

Pepper, J. (1932). Biometrika, 24, 55-64.


Biometrika 34 











[ 98 ] 


ON THE COMPUTATION OF UNIVERSAL MOMENTS OF TESTS OF 
STATISTICAL NORMALITY DERIVED FROM SAMPLES DRAWN 
AT RANDOM FROM A NORMAL UNIVERSE. APPLICATION TO 
THE CALCULATION OF THE SEVENTH MOMENT OF b₂


By R. C. GEARY AND J. P. G. WORLLEDGE


1. INTRODUCTORY


The principal object of this communication is to develop a computational technique appropriate
to the formula given by one of the authors (Geary, 1933). By way of illustration the
formula is applied to the computation of the seventh moment of

$$b_2 = \frac{m_4}{m_2^2} = n \sum_{i=1}^{n} (x_i - \bar{x})^4 \Big/ \Big\{ \sum_{i=1}^{n} (x_i - \bar{x})^2 \Big\}^2, \qquad (1{\cdot}1)$$

where x₁, x₂, ..., xₙ are the measures of the random sample of n and of which x̄ is the arithmetic
mean. Universal normality is assumed throughout.

A glance at formula (3·9) in which this paper culminates will indicate that the task of
deriving higher normal moments of b₂ is not one to be undertaken in a frivolous spirit. The
work finds its main justification in the conviction of the authors that accurate (if not exact)
values of the probability points of b₂ can be found in terms of the moments of b₂ for all values
of n using a method which has proved successful in the case of the analogous test of asymmetry,
involving

$$\sqrt{b_1} = \frac{m_3}{m_2^{3/2}} = \sqrt{n} \sum_{i=1}^{n} (x_i - \bar{x})^3 \Big/ \Big\{ \sum_{i=1}^{n} (x_i - \bar{x})^2 \Big\}^{3/2}. \qquad (1{\cdot}2)$$


In turn, the importance of the determination of accurate probabilities for √b₁ and b₂ for
normal samples derives from the facts revealed by unpublished work by one of the authors.
This shows (1) that probabilistic inferences drawn from the well-known significance tests
based on the assumption of universal normality are apt to go astray when, in fact, the universe
is not normal, and (2) that √b₁ and b₂ provide the most efficient tests of asymmetry and
kurtosis, respectively, in indefinitely large samples, amongst wide fields of alternative tests 
and of alternative non-normal universes. 

R. A. Fisher (1930) has given the exact values of the second, fourth and sixth moments of 
√b₁ and J. Pepper (1932) the eighth moment. In the former paper R. A. Fisher also gave the
values of the second and third moments of b₂. The moment field was extended by J. Wishart
(1930) and in a joint paper by R. A. Fisher & J. Wishart (1931). C. T. Hsu & D. N. Lawley
(1940) gave the fifth and sixth moments of b₂. All these authors used the combinatorial
method due to R. A. Fisher (1929). The present approach is entirely different. 


2. THE FUNDAMENTAL RELATION

To make the exposé complete it may be useful to reproduce the relevant part (which is quite
brief) of the 1933 paper. The method used is due essentially to C. C. Craig (1928), applied to
the normal case. Using, in the usual notation, a prefixed E to indicate ‘expected’ or, more
accurately, ‘average value for all samples’, we have for the characteristic function of the

$$z_i = x_i - \bar{x} \quad (i = 1, 2, \ldots, n), \qquad (2{\cdot}1)$$

the expression

$$E \exp\Big[\sum_{i=1}^{n} t_i z_i\Big], \qquad (2{\cdot}2)$$

where the $t_i$ are n parameters, so that

$$E z_1^{a_1} z_2^{a_2} \ldots z_p^{a_p},$$

where the $a_i$ are any p positive integers, is the coefficient of

$$t_1^{a_1} t_2^{a_2} \ldots t_p^{a_p}$$

in the expansion of

$$a_1!\,a_2! \ldots a_p!\; E \exp\{t_1(x_1 - \bar{x}) + t_2(x_2 - \bar{x}) + \ldots + t_p(x_p - \bar{x})\}. \qquad (2{\cdot}3)$$

The exponent can be written in the form

$$x_1(t_1 - \bar{t}) + x_2(t_2 - \bar{t}) + \ldots + x_n(t_n - \bar{t}),$$

where $\bar{t} = \sum_{i=1}^{n} t_i/n$. Since the $x_i$ are independent,

$$E \exp \sum_{i=1}^{n} x_i(t_i - \bar{t}) = \prod_{i=1}^{n} E \exp x_i(t_i - \bar{t}). \qquad (2{\cdot}4)$$

Assuming, as we may without loss of generality, that the normal universe of the $x_i$ has
mean zero and unit standard deviation, we have

$$E \exp x_i(t_i - \bar{t}) = \exp \tfrac{1}{2}(t_i - \bar{t})^2.$$

Hence

$$a_1!\,a_2! \ldots a_p!\; E \exp \sum_{i=1}^{n} t_i(x_i - \bar{x}) = a_1!\,a_2! \ldots a_p! \exp \tfrac{1}{2} \sum_{i=1}^{n} (t_i - \bar{t})^2. \qquad (2{\cdot}5)$$

By definition, the power f of a term is given by $f = \sum_i a_i$ and the dimension by p. It is clear
that the required universal mean value of

$$E(x_1 - \bar{x})^{a_1} \ldots (x_p - \bar{x})^{a_p}$$

will be found as the coefficient of $\prod t_i^{a_i}$ in the expansion of

$$\frac{a_1!\,a_2! \ldots a_p!}{2^k\,k!} \Big\{ (t_1^2 + t_2^2 + \ldots + t_p^2) - \frac{(t_1 + t_2 + \ldots + t_p)^2}{n} \Big\}^k, \qquad (2{\cdot}6)$$

where 2k = f.


3. THE COMPUTATIONAL SCHEME 


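The computational scheme, which is quite general, will most clearly be outlined by reference
to the computation of the exact value of a specific moment (from origin) $\mu'_7(m_4)$, for the derivation
of which it was primarily designed. Then

$$
\begin{aligned}
\mu'_7(m_4) ={}& \frac{1}{n^7} E(z_1^4 + z_2^4 + \ldots + z_n^4)^7 \\
={}& \frac{1}{n^7} \Big[\, n\,\frac{7!}{7!}\,E(28) \\
&+ n(n-1) \Big\{ \frac{7!}{6!\,1!} E(24\cdot4) + \frac{7!}{5!\,2!} E(20\cdot8) + \frac{7!}{4!\,3!} E(16\cdot12) \Big\} \\
&+ n(n-1)(n-2) \Big\{ \frac{7!}{5!\,1!^2\,2!} E(20\cdot4^2) + \frac{7!}{4!\,2!\,1!} E(16\cdot8\cdot4) + \frac{7!}{3!^2\,1!\,2!} E(12^2\cdot4) + \frac{7!}{3!\,2!^2\,2!} E(12\cdot8^2) \Big\} \\
&+ n(n-1)(n-2)(n-3) \Big\{ \frac{7!}{4!\,1!^3\,3!} E(16\cdot4^3) + \frac{7!}{3!\,2!\,1!^2\,2!} E(12\cdot8\cdot4^2) + \frac{7!}{2!^3\,1!\,3!} E(8^3\cdot4) \Big\} \\
&+ n(n-1)(n-2)(n-3)(n-4) \Big\{ \frac{7!}{3!\,1!^4\,4!} E(12\cdot4^4) + \frac{7!}{2!^2\,1!^3\,2!\,3!} E(8^2\cdot4^3) \Big\} \\
&+ n(n-1)(n-2)(n-3)(n-4)(n-5)\, \frac{7!}{2!\,1!^5\,5!} E(8\cdot4^5) \\
&+ n(n-1)(n-2)(n-3)(n-4)(n-5)(n-6)\, \frac{7!}{1!^7\,7!} E(4^7) \Big], \qquad (3{\cdot}1)
\end{aligned}
$$

where, for example,

$$E(12\cdot8\cdot4^2) = E z_1^{12} z_2^{8} z_3^{4} z_4^{4} = E(x_1 - \bar{x})^{12} (x_2 - \bar{x})^{8} (x_3 - \bar{x})^{4} (x_4 - \bar{x})^{4}.$$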
There are, accordingly, fifteen terms made up of one of dimension one, three of dimension
two, four of dimension three, etc. The structure of the numerical coefficients will be noted:
in particular that, when the power of a factorial appears in the denominator, its factorial
also appears. Each of the fifteen E terms will be evaluated separately, grouped by dimensions
and multiplied by the n-factors.


As already stated, the value of $E(a_1 a_2 a_3 \ldots)$ will be found as the coefficient of

$$t_1^{a_1} t_2^{a_2} t_3^{a_3} \ldots$$

in the expansion of

$$\frac{a_1!\,a_2!\,a_3! \ldots}{2^{14}\,14!} \{\Sigma t_i^2 + \nu(\Sigma t_i)^2\}^{14}, \qquad (3{\cdot}2)$$

with ν = −1/n. In this case, of course, $f = \Sigma a_i = 28$.

Expand (3·2) in powers of ν by the binomial theorem. Each of the ν power terms will, in
general, make a numerical contribution to the value of $E(a_1 a_2 \ldots)$ which will, accordingly,
be represented by a polynomial in ν of degree 14. The term in $\nu^s$ will be

$$\frac{a_1!\,a_2! \ldots}{14!\,2^{14}} \cdot \frac{14!\,\nu^s}{s!\,(14-s)!} \cdot (14-s)!\,(2s)!\; \Sigma_s \frac{1}{(a_1 - 2s_1)!\,s_1!\,(a_2 - 2s_2)!\,s_2! \ldots}. \qquad (3{\cdot}3)$$

In the $\Sigma_s$, summation extends to all non-negative integer series $s_1, s_2, \ldots$, so that $\Sigma s_i = (14 - s)$,
$s_1$ being associated with $a_1$, $s_2$ with $a_2$, etc. The values which the $s_i$ can assume are obviously
restricted further by the condition that

$$a_i \geq 2s_i.$$

Let the series $(\Sigma_0, \Sigma_1, \Sigma_2, \ldots)$ be termed the reciprocal factorial vector (hereafter usually written
‘r.f.v.’) of $a_1, a_2, \ldots$, the terms of the vector being regarded as of the order indicated by the
subscript. The vector will be indicated by clarendon type. From the computational point of
view the following relation is fundamental:

$$\mathbf{A} \times \mathbf{B} = \mathbf{AB}, \qquad (3{\cdot}4)$$

where $\mathbf{A} = (a_1 a_2 \ldots)$ and $\mathbf{B} = (b_1 b_2 \ldots)$ are any two r.f.v.’s. The multiplication sign at (3·4)
is defined as follows: the terms of A are multiplied respectively by ν⁰, ν¹, ν², etc., and added to
give a scalar A; the terms of B in the reverse order are also multiplied respectively by
ν⁰, ν¹, ν², ... and summed to give B. The coefficients of ν⁰, ν¹, ν², ... in the product (in the
ordinary sense) AB give the vector AB. Relation (3·4) is immediately evident from the form
of $\Sigma_s$ in (3·3). From this relation it is quite easy to build up r.f.v.’s from those of lower order:
44 from 4, 84 from 8 and 4, 88444 from 8 and 8444, or 88 and 444, etc.
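
The multiplication rule (3·4) amounts to an ordinary polynomial product, i.e. a convolution of the two vectors. A minimal Python sketch with exact fractions follows. The names `rfv` and `rfv_product` are illustrative and do not come from the paper, and the single-power formula used in `rfv` (entry of order j equal to 1/((2j)!(a/2 − j)!) for an even power a) is an assumption read off from (3·3); it agrees with the appendix entries for 4 and 8.

```python
from fractions import Fraction
from math import factorial


def rfv(a):
    """Reciprocal factorial vector of a single even power a (orders 0..a/2)."""
    half = a // 2
    return [Fraction(1, factorial(2 * j) * factorial(half - j))
            for j in range(half + 1)]


def rfv_product(A, B):
    """Relation (3.4): the r.f.v. of the combined powers is the convolution of A and B."""
    out = [Fraction(0)] * (len(A) + len(B) - 1)
    for i, x in enumerate(A):
        for j, y in enumerate(B):
            out[i + j] += x * y
    return out


v4, v8 = rfv(4), rfv(8)
v44 = rfv_product(v4, v4)
v884 = rfv_product(rfv_product(v8, v8), v4)

# Two factorizations of 88444, in the spirit of the double check of Table 5.
v88444_a = rfv_product(v884, v44)                        # 884 x 44
v88444_b = rfv_product(rfv_product(v884, v4), v4)        # 8844 x 4

print(v44)                        # orders 0..4 of the vector 44
print(v88444_a == v88444_b)       # the two routes should agree
```

Exact rational arithmetic stands in here for the primal notation; the entries are the same numbers, written as fractions rather than as negative primals.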




















Having found all fifteen r.f.v.’s the second step in the computational process is to form the
scalar product of each r.f.v. and $(2s)!\,\nu^s/s!$ (the latter will be termed the ν-multipliers which,
it is important to note, are the same for all the terms in (3·1)) which, from (3·3), gives
$E(a_1 a_2 \ldots)$ divided by

$$a_1!\,a_2! \ldots / 2^{14}.$$

The latter are multiplied by the numerical factors in (3·1) to give what are termed the
constant multipliers, ‘constant’ in the sense that they are the same for all the ν power terms
in each of the E’s in (3·1), but these constant terms are different for the different E’s. For
example, the constant multiplier for the term $E z_1^{12} z_2^{8} z_3^{4} z_4^{4}$ is

$$\frac{12!\,8!\,4!^2\,7!}{3!\,2!\,1!^2\,2!\,2^{14}}. \qquad (3{\cdot}5)$$

Note the ‘absolute constants’ 7! and $2^{14}$, and that the powers of the term appear as factorials
in the numerator and factorials one-fourth of these powers in the denominator. In the
denominator is also a 2! which is the factorial of the factorial power.

The third step in the computation is to sum the terms of the same dimensions. The final
step consists in the multiplication of the terms of the different dimensions by the ν-factors
as follows:


Table 1. ν-factors

Dimension    ν-factors
    1        ν⁶
    2        −(ν⁶ + ν⁵)
    3        2ν⁶ + 3ν⁵ + ν⁴
    4        −(6ν⁶ + 11ν⁵ + 6ν⁴ + ν³)                                  (3·6)
    5        24ν⁶ + 50ν⁵ + 35ν⁴ + 10ν³ + ν²
    6        −(120ν⁶ + 274ν⁵ + 225ν⁴ + 85ν³ + 15ν² + ν)
    7        720ν⁶ + 1764ν⁵ + 1624ν⁴ + 735ν³ + 175ν² + 21ν + 1

The ν-factors at (3·6) are, of course, the n-factors in (3·1) with ν = −1/n.
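
The entries of Table 1 follow from writing n(n−1)…(n−d+1)/n⁷ = (−ν)^(7−d) (1+ν)(1+2ν)…(1+(d−1)ν). A short Python sketch of that expansion is given below; the function name `nu_factor` is illustrative only, and the returned list holds the coefficients of ν⁰, ν¹, …, ν⁶ in that order.

```python
def nu_factor(d, r=7):
    """Coefficients c[0..r-1] with n(n-1)...(n-d+1)/n**r = sum(c[k] * nu**k), nu = -1/n."""
    poly = [1]                                   # ascending powers of nu
    for i in range(1, d):                        # multiply by (1 + i*nu)
        nxt = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            nxt[k] += c
            nxt[k + 1] += i * c
        poly = nxt
    shift = r - d                                # remaining factor n**-(r-d) = (-nu)**(r-d)
    sign = (-1) ** shift
    return [0] * shift + [sign * c for c in poly]


for d in range(1, 8):
    print(d, nu_factor(d))
```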

To deal with the very large whole numbers and their reciprocals which arise in factorial
computation we had recourse to a prime number index notation. For this purpose the number
is factorized into powers of the lower primes; we have used the notation for primes not
exceeding 31. Thus

$$746{,}137{,}199{,}808{,}000 = 6847 \cdot 13^1 \cdot 11^1 \cdot 7^2 \cdot 5^3 \cdot 3^5 \cdot 2^9$$

is written in this notation 6847[112359],

the digits in the square brackets [ ] being the powers of the lowest primes arranged in ascending
order from the right. The ordinary number 6847 will be known as the coefficient and the
symbolical number in square brackets as the primal of the original number. Note that in
this example the notation effects an economy from 15 to 10 in the number of digits
required to describe the number. Should the original number not be factorizable by a
particular small prime a 0 will be inserted in the proper place, e.g. [10358] means that 7 is
not a factor of the number represented. If, as often happens with the first two primes, the
indices exceed 9, decimal points are used, e.g. [124·11·17] means that the original number
has 2¹⁷ and 3¹¹ as factors. The primal notation can be used when the indices are all positive
or all negative: occasionally, however, + and − signs have to be mixed in the primal (see
Table 6).

With a little practice great facility is acquired in applying the ordinary rules to numbers
in primal notation. For multiplication or division corresponding digits in the primals are
added or subtracted, the coefficients being dealt with in the ordinary way. In addition or
subtraction common factors in the primals are immediately evident and the coefficient of
the sum (or difference) is derived usually by a single product-sum (or product-difference)
operation on a multiplying machine. It may be observed that all the work for this paper
was executed without inconvenience on small hand multiplying machines with capacity
9 × 8 × 13.
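
The primal notation and its multiplication rule can be sketched in a few lines of Python. The helper names (`to_primal`, `show`, `multiply`) are hypothetical. The placement of the dots in `show` is a simplification, assumed here for illustration: a dot is written only around an index above 9, whereas the printed tables sometimes set off a neighbouring single digit as well.

```python
from math import factorial

PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31)


def to_primal(n):
    """Split n into (coefficient, indices of 2, 3, 5, ... up to 31)."""
    exps = []
    for p in PRIMES:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    while len(exps) > 1 and exps[-1] == 0:       # drop unused high primes
        exps.pop()
    return n, exps


def show(coeff, exps):
    """Write the primal with the index of 2 rightmost; dots set off indices above 9."""
    body = ""
    for e in reversed(exps):
        body += f"·{e}·" if e > 9 else str(e)
    body = body.replace("··", "·").strip("·")
    return (str(coeff) if coeff != 1 else "") + "[" + body + "]"


def multiply(a, b):
    """Multiply two (coefficient, indices) pairs by adding corresponding indices."""
    (ca, ea), (cb, eb) = a, b
    size = max(len(ea), len(eb))
    ea = ea + [0] * (size - len(ea))
    eb = eb + [0] * (size - len(eb))
    return ca * cb, [x + y for x, y in zip(ea, eb)]


print(show(*to_primal(746137199808000)))     # the worked example of the text
print(show(*to_primal(factorial(12))))       # an entry of Table 2
```

Division works in the same way with the indices subtracted, which is where mixed signs of the kind seen in Table 6 arise.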

In the following tables the first thirty-two factorials, the ν-multipliers and the constant
multipliers required for the computation of $\mu'_7(m_4)$ are expressed in primal notation.


Table 2. Factorials in primal notation

0! = 1! = [0]                17! = [111236·15]
     2! = [1]                18! = [111238·16]
     3! = [11]               19! = [1111238·16]
     4! = [13]               20! = [1111248·18]
     5! = [113]              21! = [1111349·18]
     6! = [124]              22! = [1112349·19]
     7! = [1124]             23! = [11112349·19]
     8! = [1127]             24! = [1111234·10·22]
     9! = [1147]             25! = [1111236·10·22]
    10! = [1248]             26! = [1112236·10·23]
    11! = [11248]            27! = [1112236·13·23]
    12! = [1125·10]          28! = [1112246·13·25]
    13! = [11125·10]         29! = [11112246·13·25]
    14! = [11225·11]         30! = [11112247·14·26]
    15! = [11236·11]         31! = [111112247·14·26]
    16! = [11236·15]         32! = [111112247·14·31]


Table 3. ν-Multipliers in factorial and primal notation

Term in    Coefficient
  ν⁰        0!/0!   = [0]
  ν¹        2!/1!   = [1]
  ν²        4!/2!   = [12]
  ν³        6!/3!   = [113]
  ν⁴        8!/4!   = [1114]
  ν⁵       10!/5!   = [1135]
  ν⁶       12!/6!   = [11136]
  ν⁷       14!/7!   = [111137]
  ν⁸       16!/8!   = [111248]
  ν⁹       18!/9!   = [1111249]
  ν¹⁰      20!/10!  = [1111124·10]
  ν¹¹      22!/11!  = [1111225·11]
  ν¹²      24!/12!  = [11111225·12]
  ν¹³      26!/13!  = [11111245·13]
  ν¹⁴      28!/14!  = [11111248·14]








Table 4. Constant multipliers in factorial and primal notation

Required for computation of
the undermentioned term in (3·1)    Constant multiplier
E(28)        : 28! 7!/(7! 2¹⁴)                     = [1112246·13·11]
E(24·4)      : 24! 4! 7!/(6! 1! 2¹⁴)               = [1111244·11·11]
E(20·8)      : 20! 8! 7!/(5! 2! 2¹⁴)               = [111145·11·11]
E(16·12)     : 16! 12! 7!/(4! 3! 2¹⁴)              = [1246·11·11]
E(20·4²)     : 20! 4!² 7!/(5! 1!² 2! 2¹⁴)          = [111134·11·10]
E(16·8·4)    : 16! 8! 4! 7!/(4! 2! 1! 2¹⁴)         = [1145·10·11]
E(12²·4)     : 12!² 4! 7!/(3!² 2! 1! 2¹⁴)          = [235·11·10]
E(12·8²)     : 12! 8!² 7!/(3! 2!² 2! 2¹⁴)          = [145·10·10]
E(16·4³)     : 16! 4!³ 7!/(4! 1!³ 3! 2¹⁴)          = [1134·9·10]
E(12·8·4²)   : 12! 8! 4!² 7!/(3! 2! 1!² 2! 2¹⁴)    = [134·10·10]
E(8³·4)      : 8!³ 4! 7!/(2!³ 3! 1! 2¹⁴)           = [44·8·10]
E(12·4⁴)     : 12! 4!⁴ 7!/(3! 1!⁴ 4! 2¹⁴)          = [12398]
E(8²·4³)     : 8!² 4!³ 7!/(2!² 2! 1!³ 3! 2¹⁴)      = [3389]
E(8·4⁵)      : 8! 4!⁵ 7!/(2! 1!⁵ 5! 2¹⁴)           = [2188]
E(4⁷)        : 4!⁷ 7!/(1!⁷ 7! 2¹⁴)                 = [77]


The theory will be illustrated by reference to the computation of $E z_1^8 z_2^8 z_3^4 z_4^4 z_5^4 = E(8^2\,4^3)$.
First the r.f.v. 88444 is found as the product 884 × 44 by setting down in equal spaces the
terms of 884 and on a movable slip spaced to the former the terms of 44 in reverse:


All primals are negative

884 (orders 0 to 10)                              : [27]  5[27]  109[39]  111[137]  803[112·10]  1493[114·10]  389[122·12]  119[124·11]  1543[225·15]  31[225·15]  [225·17]
44 in reverse (slip set opposite orders 1 to 5)   :       [26]   [13]     7[13]     [1]          [2]


























The term in 88444 from the position illustrated is that of the 5th order, namely, 
5[4·13] + 109[4·12] + 111·7[14·10] + 803[112·11] + 1493[114·12]
= [114·13] (5·7·5 + 109·7·5·2 + 777·7·8 + 803·9·4 + 1493·2)
= [114·13] (3·27737) = 27737[113·13].


The manner of computation is indicated: first the largest (negative) digits in each of the four 
positions of the primals are underlined and the underlined set is regarded as the common 
factor. Note how, at the final stage, the factor 3 of the coefficient reduces the primal digit 
from 4 to 3. From the entries in the round brackets ( ) it will be clear that, as stated above, 
the procedure is well adapted to the multiplying machine. The full calculation of 88444 is 
shown in Table 5. 

The identity of the r.f.v.’s from the two factorizations of 88444 constitutes an absolute
check of the work. The calculation of E(8² 4³) required for (3·1) is completed in Table 6.
In practice the figures in columns (4) and (5) of this table were derived from those in column
(3), and in Tables 4 and 5 by entering the latter on two movable slips and folding opposite
each entry, as required. This stage of the work was rapidly executed. The sum-product of
columns (1), (2) and (5) gives the value of E(8² 4³). All the r.f.v.’s required for the calculation
of the E’s for (3·1) are given in the appendix.
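
The second step, in which each r.f.v. entry is combined with its ν-multiplier (2s)!/s! and the factor a₁!a₂!…/2^k, can be sketched as follows. This is an illustrative Python version under the same assumption as before (even powers only, single-power r.f.v. entries 1/((2j)!(a/2 − j)!)); the name `e_coefficients` is hypothetical. It returns the coefficients of ν⁰, ν¹, … in E(a₁a₂…), so that multiplying by the numerical factor from (3·1) gives the products of columns (2) and (5) of Table 6.

```python
from fractions import Fraction
from math import factorial, prod


def e_coefficients(powers):
    """Coefficients of nu**0, nu**1, ... in E(z_1**a_1 z_2**a_2 ...) for even powers a_i."""
    vec = [Fraction(1)]
    for a in powers:                             # build the r.f.v. by repeated products
        single = [Fraction(1, factorial(2 * j) * factorial(a // 2 - j))
                  for j in range(a // 2 + 1)]
        nxt = [Fraction(0)] * (len(vec) + len(single) - 1)
        for i, x in enumerate(vec):
            for j, y in enumerate(single):
                nxt[i + j] += x * y
        vec = nxt
    k = sum(powers) // 2
    const = Fraction(prod(factorial(a) for a in powers), 2 ** k)
    # nu-multiplier (2s)!/s! applied to the entry of order s
    return [const * Fraction(factorial(2 * s), factorial(s)) * c
            for s, c in enumerate(vec)]


coeffs = e_coefficients((8, 8, 4, 4, 4))
# Numerical factor 7!/(2!^2 1!^3 2! 3!) attached to E(8^2 4^3) in (3.1).
numerical_factor = Fraction(factorial(7), factorial(2) ** 2 * factorial(2) * factorial(3))
print(coeffs[0] * numerical_factor)              # compare with 1 x [3360] in Table 6
```

As a sanity check on the sketch, the power 4 alone gives 3 + 6ν + 3ν² = 3(1 + ν)², the familiar fourth moment of a normal residual with variance 1 − 1/n.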







Table 5. Calculation of reciprocal factorial vector 88444
All primals are negative

Order                                                                                     r.f.v. of 88444
(i) By 884 × 44
 0 : [29]                                                                               = [29]
 1 : [28] + 5[29]                                                                       = 7[29]
 2 : 7[3·10] + 5[28] + 109[3·11]                                                        = 9[·11]
 3 : [3·10] + 35[3·10] + 109[3·10] + 111[139]                                           = 947[13·10]
 4 : [4·13] + 5[3·10] + 763[4·12] + 111[138] + 803[112·12]                              = 1811[110·13]
 5 : 5[4·13] + 109[4·12] + 777[14·10] + 803[112·11] + 1493[114·12]                      = 27737[113·13]
 6 : 109[5·15] + 111[14·10] + 5621[113·13] + 1493[114·11] + 389[122·14]                 = 1783141[125·15]
 7 : 111[15·13] + 803[113·13] + 1493[15·13] + 389[122·13] + 119[124·13]                 = 20627[115·13]
 8 : 803[114·16] + 1493[115·13] + 389[23·15] + 119[124·12] + 1543[225·17]               = 1772417[225·17]
 9 : 1493[116·16] + 389[123·15] + 119[25·14] + 1543[225·16] + 31[225·17]                = 547889[226·17]
10 : 389[124·18] + 119[125·14] + 1543[126·18] + 31[225·16] + [225·19]                   = 151331[226·19]
11 : 119[126·17] + 1543[226·18] + 217[226·18] + [225·18]                                = 127[223·18]
12 : 1543[227·21] + 31[226·18] + [126·20]                                               = 2329[227·21]
13 : 31[227·21] + [226·20]                                                              = 37[227·21]
14 : [227·23]                                                                           = [227·23]

(ii) By 8844 × 4
 0 : [29]                                                                               = [29]
 1 : [29] + [18]                                                                        = 7[29]
 2 : [3·11] + [18] + 85[3·10]                                                           = 9[·11]
 3 : [2·10] + 85[3·10] + 169[12·10]                                                     = 947[13·10]
 4 : 85[4·12] + 169[12·10] + 11113[104·13]                                              = 1811[110·13]
 5 : 169[13·12] + 11113[104·13] + 5137[114·11]                                          = 27737[113·13]
 6 : 11113[105·15] + 5137[114·11] + 22703[124·13]                                       = 1783141[125·15]
 7 : 5137[115·13] + 22703[124·13] + 9341[125·13]                                        = 20627[115·13]
 8 : 22703[125·15] + 9341[125·13] + 90541[225·17]                                       = 1772417[225·17]
 9 : 9341[126·15] + 90541[225·17] + 2453[225·16]                                        = 547889[226·17]
10 : 90541[226·19] + 2453[225·16] + 137[126·18]                                         = 151331[226·19]
11 : 2453[226·18] + 137[126·18] + 17[226·18]                                            = 127[223·18]
12 : 137[127·20] + 17[226·18] + [226·21]                                                = 2329[227·21]
13 : 17[227·20] + [226·21]                                                              = 37[227·21]
14 : [227·23]                                                                           = [227·23]


Finally, the E’s are multiplied by the appropriate ν-factors given in Table 1, to give the
value of $E(m_4^7)$. Now R. A. Fisher (1930) (see also Geary, 1933) has shown that

$$\mu'_7(b_2) = E(b_2^7) = E(m_4^7)/E(m_2^{14}) \qquad (3{\cdot}7)$$

and

$$E(m_2^{14}) = (n-1)(n+1)(n+3) \ldots (n+23)(n+25)/n^{14}. \qquad (3{\cdot}8)$$
Finally, 4,(b,) = (37n"3 + 211-37n!2 + 64,802 - 36n!2 + 13,154,290-35n!° 
+ 668,584,331 - 35n® + 25,489,306,481 - 35n8 + 74,020,784,452-7-35n7 
— 72,634,851 ,124-7- 5-38 + 407,081,273,655-7-5-3%n® 
— 1,287,510,783,723-7-5- 34 + 2,526,463,322,982-7-5- 3% 
— 280,521,238,122-11-7-5-3%n? + 3,036,544,767-13-11-7-5?- 38 


— 135,393,525-13-11-72523%)/(n +1) (n+3)...(m+23)(m+25). (3-9) 













4. CORROBORATION OF FORMULAE 


An integral part of the present work is the technique of check. To be of value the formulae at 
(3-9) and (3-10) must be absolutely correct because (1) any errors made in factorial work are 
fairly certain to be large and (2) the formulae are designed for use when n is small, when 
relatively small errors in the numerical coefficients may materially affect the results. Further- 
more, it is almost impossible to avoid error (even in a joint work like the present) with so 
many individual calculations involving numbers astronomically large. As will appear, there 
is a satisfactory, though not absolute, check at the final stage; but if it reveals error it does 
not show where the error occurred, so that, if this were the sole check, there would be no 


Table 6. Calculation of E(8² 4³) from 88444

            r.f.v. 88444
Term    Coefficient   Primal (neg.)    (3) × ν-multiplier    (4) × constant
                                       (Table 3)             multiplier [3389]
 (1)        (2)           (3)               (4)                   (5)
 ν⁰            1       [29]            [−2−9]                [3360]
 ν¹            7       [29]            [−2−8]                [3361]
 ν²            9       [·11]           [1−9]                 [3390]
 ν³          947       [13·10]         [−2−7]                [3362]
 ν⁴         1811       [110·13]        [1−9]                 [3390]
 ν⁵        27737       [113·13]        [−8]                  [3381]
 ν⁶      1783141       [125·15]        [10−1−2−9]            [13260]
 ν⁷        20627       [115·13]        [1100−2−6]            [113363]
 ν⁸      1772417       [225·17]        [11−10−1−9]           [112370]
 ν⁹       547889       [226·17]        [111−10−2−8]          [1112361]
 ν¹⁰      151331       [226·19]        [1111−10−2−9]         [11112360]
 ν¹¹         127       [223·18]        [1111002−7]           [111133·10·2]
 ν¹²        2329       [227·21]        [1111100−2−9]         [111113360]
 ν¹³          37       [227·21]        [1111102−2−8]         [111113561]
 ν¹⁴           1       [227·23]        [11111021−9]          [111113590]





alternative but to face the tedium of complete recalculation. It is essential to devise an
absolute check at each stage. This has been done for the present technique.

The first check is the n = 1 (or ν = −1) check. This is applicable to the E’s (see (3·1)) of
dimension one, two and three. It derives from the fact that

$$\Big[ \Sigma t_i^2 - (\Sigma t_i)^2 \Big]^{14} = \Big( -2 \sum_{i>j} t_i t_j \Big)^{14} = 2^{14} \Big( \sum_{i>j} t_i t_j \Big)^{14}. \qquad (4{\cdot}1)$$

It will be immediately evident from the latter that when ν = −1 the following terms vanish
identically:
(i) all terms of one dimension;

(ii) all terms of two dimensions except those of the type $t_1^{14} t_2^{14}$ with which we are not
concerned;

(iii) all terms of three or four dimensions in which the highest power exceeds 14: the latter
being the highest power which, say, $t_1$ can assume in the expansion of

$$2^{14} \Big( \sum_{i>j} t_i t_j \Big)^{14}.$$




Even in terms of three dimensions in which the highest power is less than 14, e.g. in $E z_1^{12} z_2^{12} z_3^{4}$,
the ν = −1 test can be exploited. In fact, from (2·5) and (4·1) the required terms for
ν = −1 are

$$E(12^2\cdot4) = \frac{14!}{10!\,2!^2} \cdot \frac{12!^2\,4!}{14!} \cdot \frac{7!}{3!^2\,2!} \cdot \frac{2^{14}}{2^{14}} = 12!\,[11124],$$

$$E(12\cdot8^2) = \frac{14!}{6!^2\,2!} \cdot \frac{12!\,8!^2}{14!} \cdot \frac{7!}{3!\,2!^2\,2!} \cdot \frac{2^{14}}{2^{14}} = 12!\,[3115].$$

The sum of these two terms is 131[1236·14] which should be the sum of the four E terms of
dimension three in (3·1) since E(20·4²) and E(16·8·4) are zero (for ν = −1). The checks
specified in this paragraph were fully applied to the terms of one, two and three dimensions
in (3·1) before multiplication by the ν-factors (Table 1).

Reciprocal factorial vectors for dimensions exceeding two were checked fully by the
‘double’ factorization technique exemplified in Table 5. In view of the simplicity of the
two subsequent processes, namely those of the ν- and constant multipliers, this check may
be taken as establishing the accuracy of the E’s of dimension three or more. Reference may
nevertheless be made to a check at this stage, namely that the ratios of consecutive coefficients
in each E exhibit a marked regularity, if correct. Any irregularity (which in the nature
of the work will usually be large) must be suspect.

Assuming the accuracy of the E’s in (3·1) the final stage was checked by multiplying by
the ν-factors (3·6) in two ways:
(i) by straight multiplication using the primal notation;

(ii) by taking (in (3·1))

$$(1+\nu)(1+2\nu) \ldots (1+6\nu)A_7 + \nu(1+\nu) \ldots (1+5\nu)A_6 + \ldots + \nu^6 A_1$$
$$= (1+\nu) \ldots (1+5\nu)\{(1+6\nu)A_7 + \nu A_6\} + \ldots,$$

and computing in successive stages

$$B_1 = (1+6\nu)A_7 + \nu A_6, \quad B_2 = (1+5\nu)B_1 + \nu^2 A_5, \quad \text{etc.}$$

The results were the same.


A satisfactory check for the final stage is that of n = 4. A. T. McKay (1933) has, in fact,
given a formula for this value of n from which the seventh moment from zero of b₂ is found
to be

82,220,810,251/5711-13-17-23-29 


which value also transpired on substituting 4 for n in (3-9). This establishes the accuracy 
of all the formula except possibly the part accruing from the terms in (3-1) in 


n(n—1)...(n—4), n(n—1)...(n—5) and n(n—1)... (n—6) 


which vanish when n = 4. 


If it adds nothing to the check in the previous paragraph it is nevertheless of interest to
observe that, for n = 3, the value of the seventh moment of b₂ is found to be (3/2)⁷ which is as
















it should be since, in this case, each b₂ assumes the constant value 3/2, whether the samples
are normal or not.

A partial check is also afforded at the final stage by the vanishing of all coefficients of
powers of ν from ν¹⁵ to ν²⁰ inclusive.
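
The n = 3 check can be reproduced end to end with a short Python sketch that strings together (3·1), (3·7) and (3·8). The function names (`partitions`, `expect`, `moment_b2`) are illustrative and not the authors’ own, and exact fractions replace the primal arithmetic. The single-power r.f.v. formula is the same assumption as in the earlier steps.

```python
from fractions import Fraction
from math import factorial, prod


def partitions(total, largest=None):
    """Yield the partitions of `total` as non-increasing tuples."""
    largest = total if largest is None else largest
    if total == 0:
        yield ()
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, first):
            yield (first,) + rest


def expect(powers, nu):
    """E(z_1**a_1 z_2**a_2 ...) for even a_i, evaluated at nu = -1/n."""
    vec = [Fraction(1)]
    for a in powers:
        single = [Fraction(1, factorial(2 * j) * factorial(a // 2 - j))
                  for j in range(a // 2 + 1)]
        nxt = [Fraction(0)] * (len(vec) + len(single) - 1)
        for i, x in enumerate(vec):
            for j, y in enumerate(single):
                nxt[i + j] += x * y
        vec = nxt
    const = Fraction(prod(factorial(a) for a in powers), 2 ** (sum(powers) // 2))
    return const * sum(Fraction(factorial(2 * s), factorial(s)) * nu ** s * c
                       for s, c in enumerate(vec))


def moment_b2(r, n):
    """r-th moment about zero of b2 for normal samples of n, via (3.1), (3.7), (3.8)."""
    nu = Fraction(-1, n)
    total = Fraction(0)
    for part in partitions(r):
        n_factor = prod(n - i for i in range(len(part)))            # n(n-1)...(n-p+1)
        repeats = prod(factorial(part.count(v)) for v in set(part))
        numerical = Fraction(factorial(r),
                             prod(factorial(v) for v in part) * repeats)
        total += n_factor * numerical * expect([4 * v for v in part], nu)
    e_m4 = total / Fraction(n) ** r                                   # E(m4**r)
    e_m2 = Fraction(prod(n - 1 + 2 * i for i in range(2 * r)), n ** (2 * r))
    return e_m4 / e_m2


print(moment_b2(7, 3))      # to be compared with (3/2)**7
print(moment_b2(7, 4))      # to be compared with the value derived from McKay's formula
```

With r = 7 the loop over `partitions(7)` visits the fifteen terms of (3·1); terms of dimension greater than n drop out through the vanishing n-factor.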


5. CONCLUSION 


Previous investigators in this field have all used the combinatorial technique, invented by 
R. A. Fisher (1929) and applied in the first instance to the cumulants, which are linear 
functions of the sample moments. The present writers have not had sufficient experience in 
working the Fisher technique to decide which method is easier to apply. It is quite likely 
that the Fisher method is shorter. A strong point of the present computational scheme is 
that it lends itself to check at every stage; and the method may appeal to students who 
prefer the algebraical or arithmetical to the geometrical approach. For their benefit, and 
also in case it may later be found necessary (in connexion with the accurate determination 
of the probability points of b₂ for samples of all sizes) to compute higher moments than the
seventh—it is almost certain the seventh will be required—we give as an appendix an 
extended series of reciprocal factorial vectors. From these can be derived without difficulty 
(i) corresponding E’s, e.g. E(x, —%)* (x,—Z)* (x,—Z)*, on multiplication by appropriate v- 
and constant multipliers and (ii) r.f.v.’s of higher powers. 


APPENDIX 


A selection of reciprocal factorial vectors required for the calculation of moments of b₂ for normal
samples, including all used for the calculation of the seventh moment, in primal notation


All primals are negative 


Order 4 8 42 12 84 
0 {1} [13] [2] [124] [14] 
1 [1] [12] [1] [114] [4] 
2 [13] [14] 7[13] [26] 31[26] 
3 [124] [13] [135] 7[115] 
4 (1127] [26] [1128] 127[1128] 
5 [1148] 17[1138] 
6 [1125-10] [113-10] 

43 16 12-4 Re 842 
0 [3] {1127] [125] [26] [15] 
1 3[3] [1125] [123] [24] [13] 
2 13[5] [137] 13[135] 5[26] 17[25] 
3 3[4] [237] [35]. 31[136] 53[125] 
4 13[17] [113-10] 47(1138} 323[1139] 2497[1138] 
5 {17] [1259] 29[ 1247] [1028] 173[1137] 
6 [39] (1125-11) 157[11259] 43[124- 10] 7[139] 
7 (11225-11] [11249] [124-10] [1049] 
8 [11236-15] [1126-13] [224-14] (114-13] 














Order 44 20 
0 [4] [1248] 
1 [2] {1148} 
2 19[14] [113-10] 
3 5[4] [124-8] 
4 49[17] [124-11] 
5 5[ 16] [135-11] 
6 19[38] [1? 26-13] 
7 [38] [11226- 12] 
8 [4-12] [11236-16] 
9 [111238 - 16] 
10 [1111248-18] 
824 843 
0 [27] [16] 
1 5(27] 5(16] 
2 109[39] 13[8] 
3 111[137] 143[126] 
4 803[112-10] 1399[1119] 
5 1493[114-10] 239[1029] 
6 389[122- 12] 6943[114-11] 
7 119[124-11] 65[104- 10] 
8 1543[225- 15] 277[114- 14] 
9 31[225-15] 23[115-14] 
10 [225-17] [115-16] 
ll 
12 
16-8 28 
0 [.13-10] (11225-11] 
1 [1129] (11125-11] 
2 13[104-11] [1126-13] 
3 43[114-11] [1136-12] 
4 3823[224-14] [236-15] 
5 1507[226- 12] [238-15] 
6  4933[1136-14] [1237-17] 
7 28943[11236-14] [11337-15] 
8  4331[1237-18] [11248-19] 
9 79[11235- 17] [111249-19] 
10 43[11237-19] [1111249-21] 
11 37[11348-19] [111234-10-20] 
12 [11348-22] [1111234- 10-23] 
13 [1112236- 10-23] 
14 [1112246-13 


+25) 


16-4 


[1128] 
[1028] 

11[13- 10} 
47[1238] 
253[124- 11] 
89[125- 11] 
51{1115- 13] 
1277[11226- 12] 
229[11234- 16] 
[11135- 16] 
[11237-18] 


45 


[5] 

5[5] 
125[17] 
35[15] 
545[28] 
23[8] 
545[3- 10] 
35[39] 
125[4-13] 
5[4-13] 
[5-15] 


12? 


[248] 
[237] 

23[249] 

47[ 259] 
289[124- 12] 
593[136- 10] 
8531[1137- 12] 
193[1136- 12] 
929[ 1236-16] 
113[1238-15] 
37[1248-17] 
[1249-17] 
[224- 10-20] 


12-8 


[137] 
[37} 

31[139] 
7[227] 
299[123-10] 
73[115- 10] 
3713[1126- 12] 
83[1126-11] 
1181[1236- 15] 
47[1237-15] 
[1237-17] 


24 


[1125-10] 
[11249] 

[125-11] 
[126-11] 
[224-14] 
[236-12] 
[1137-14] 
[11236-14] 
[11237-18] 
[111239-17] 
[1111248-19] 
[1112349- 19] 
[11112384- 10-22] 


83 


[39] 

[28] 

[-10-] 
47[12-10] 
89[101-13] 
281[ 113-11] 
2833[124-13] 
11[121-13] 
2213[224-17] 
2089[ 236-16] 
71[235-18] 
[235-18] 
[336-21] 


Computation of universal moments of tests of statistical normality 


12-4? 


[126] 

[26] 
101[138] 
19[136] 
1163[1149] 
53[239] 
629[1105- 11] 
479[1125-10] 
773[1126- 14] 
13[1126- 14] 
[1127-16] 


20-4 


[1249] 
[1238] 
53[125-10] 
31[125-10] 
23(124-13] 
13[135-11] 
151[1136- 13] 
733[11236-13] 
1153[11237-17] 
587[111238- 16] 
67[1101247- 18] 
[1111049- 18] 
[1111249- 21) 


8742 


[28] 

[17] 

85[39] 
169[129] 
11113[104- 12] 
5137[114- 10] 
22703[124- 12] 
9341[125-12] 
90541[225- 16] 
453[225- 15] 
137[126- 17] 
17[226- 17] 
f222-90) 








R. C. GEARY AND J. P. G. WORLLEDGE


24-4 


[1125-11] 
[1025-11] 
139[1126-13] 
47[1126- 12} 

79[226- 15] 

31[137- 15] 
1051[1237 - 17] 
79[11236- 15] 
73[11138- 19] 
59[101239- 19] 
5611[1111249-21] 
8011[111234- 10-20] 
1597[1111224- 10-23] 
47[1111234- 10-23] 
[1111234- 11-25] 


16-84 


[113-11] 
[13-11] 

29[14-13] 

37[113- 12] 
52139[225- 15] 
9893[216- 15] 
299297[1136- 17] 
2511043[11237- 15] 
360131[11235- 19] 
173497[11237- 19] 
589[11208-21] 
29437[11348- 20] 
3307[11348- 23] 
[10249- 23] 
[11349- 25] 


12-84? 


[139] 
7[139] 
227[14-11] 
257[23- 10] 
97523[125- 13] 
245[5- 13] 
61219[1106- 15] 
9479[ 1026-13] 
12738433[ 1237-17] 
46223[1136-17] 
53593([238- 19] 
1937[1228- 18] 
1571[1238- 21] 
53[1239-21] 
[1239-23] 


20-8 


[125-11] 
[25-11] 

19[124- 13] 
331[136- 12] 
7001[236- 15] 
1489[236- 15] 
25513[1237- 17] 
197[11127-15] 
30211[11247- 19] 
168713[111249- 19] 
310841[1111249-21] 
127[1111336- 20] 
4013[111134- 10-23] 
109[111135- 10-23] 
[111135- 10-25] 


16-48 


[112-10] 
[12-10] 
211[113- 12] 

_ 659[123-11) 
3671[123-14] 
2699[115-14] 
29593[1115- 16] 
106361[11215- 14] 
624511[11135- 18] 
884393[11236- 18] 
14113[10236- 20] 
2771[11138- 19] 
2579[11238- 22] 
23[11238- 23} 
[11239-24] 


12-44 


[128] 

7[128] 

47[3-10] 

35[39] 
319[110- 12] 
69971[124- 12] 
3151259[1125- 14] 
34637[1115- 12] 
1202651[1126- 16] 
115769[1126- 16] 
9841[1125- 18] 
1823[1127-17] 
157[1028- 20] 
[1117-20] 
[1129-22] 


20-4? 


[124- 
[24- 

179[125- 
87[125- 
151[26- 
1501[136- 
1609[1036- 
27487[11236- 
13043[11137- 
209509[111238- 
2197[1101244- 
63653[1111249- 
1873[1111248- 
101[111124- 10- 
[111124-10- 


1274 


10) 
10) 
12] 
11) 
14] 
14] 
16] 
14] 
18] 
18] 
20) 
19} 
22) 
22) 
24] 


[249] 
7[249] 


211[25- 
119[25- 
3821[125- 
18667[136- 
490283[1137- 
193[1133- 
441773[1238- 
24799[1238- 
2647[1148- 
677[1249- 
2707[224- 10- 
23[224- 10- 
[224-11- 


894 


[3- 

7[3- 
235[4- 
281[13- 
4177[112- 
643[111- 
98797[124- 
907[114- 
37151[215- 
. 228503[236- 
50333([236- 
391[137- 
1163[336- 
[325- 
[337- 


11] 
10] 
13] 
13] 
15] 
13] 
17] 
17] 
19] 
18] 
21] 
21] 
23] 


10] 
10) 
12] 
11] 
14] 
14] 
16] 
14] 
18] 
18] 
20] 
19] 
22] 
22] 
24) 




16-12 


[124-11] 
[24-11] 
187[125- 13] 
379[135- 12] 
2819[234- 15] 
17161[237- 15] 
104507[1237 - 17] 
30317[11237- 15] 
431099[11248- 19] 
17177[11248- 19] 
5513[11249-21] 
1019[1134- 10-20] 
991[1234- 10-23] 
31[1235- 10-23] 
[1235-11-25] 


12-8? 


[14-10] 
7[14- 10] 
73[14-12] 
667([25- 11] 
22697[125-14] 
4679[ 116-14] 
3257831[1137- 16] 
22177[1127-14] 
374281[1236-18] 
204907[1238- 18] 
4783[1138- 20] 
2251[1248- 19} 
1277[1339- 22] 
61[1349- 22] 
[1349-24] 


8248 


[29] 

729] 

[+11] 
947[13-10] 
1811[110-13] 
27737[113- 13] 
1783141[125- 15] 
206275115: 13] 
1772417[225- 17] 
547889[226- 17] 
151331(226- 19] 
127[223- 18] 
2329[227-21] 
37[227-21] 
[227 - 23] 













Order 









845 4 
[18] [7] 
7[18] 7[7] 
251[2- 10) 259[19] 
1051[12-9] 77[8] 
182843[113- 12] 2107[1-11] 
24391[103- 12] 1603{1- 11] 
127741[104- 14} 29771[3- 13] 
1313[4- 12] 2609[3- 11] 
423697[115- 16] 29771[4- 15] 
54827[115- 16] 1603[3- 15] 
11455[106- 18] 2107[4- 17] 
47[6-17] 77[4- 16] 
95[106- 20] 259([6- 19] 
29[117-20] 7[6- 19] 
(117-22) [7°21] 

REFERENCES 


Craig, C. C. (1928). Metron, 7, 3.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.

Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 195.

Geary, R. C. (1933). Biometrika, 25, 184.

Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.

McKay, A. T. (1933). Biometrika, 25, 411.

Pepper, J. (1932). Biometrika, 24, 55.

Wishart, J. (1930). Biometrika, 22, 224.







THE ASYMPTOTICAL DISTRIBUTION OF RANGE 
IN SAMPLES FROM A NORMAL POPULATION 


By G. ELFVING, Helsingfors 


1. Introductory. Consider a sample of n observations, taken from an infinite normal 
population with the mean 0 and the standard deviation 1. Let a be the smallest and b the 
greatest of the observed values. Then w = b—a is the range of the sample. 

For certain statistical purposes knowledge of the sampling distribution of range is needed. 
The distribution function, however, involves a rather complicated integral, whose exact 
calculation is, for n > 2, impossible. Tippett (1925), E. S. Pearson (1926, 1932) and McKay 
& Pearson (1933) have studied and calculated the mean, the standard deviation and the 
Pearson constants $\beta_1$, $\beta_2$ of the range. Fitting appropriate Pearson curves to the distribution
by means of these parameters, Pearson (1932) has computed approximate percentage points 
for it. Later on, Hartley (1942) and Hartley & Pearson (1942) have, by numerical integration, 
tabulated the distribution function for n = 2,..., 20. 

As pointed out by Pearson, the distribution of range is very sensitive to departures from 
normality in the tails of the parental distribution. The effect of such departures becoming 
more perceptible for increasing n, the practical importance of the range distribution is,
perhaps, small for large samples. Nevertheless, it seems to be at least of theoretical interest
to investigate the asymptotical distribution of range for $n \to \infty$. This is the purpose of the
present paper.* The results are summarized in a theorem at the end of the inquiry.


2. The exact distribution. Transformations. The joint frequency function of the extremes
a, b reads, as is well known,

$$f_n(a, b) = n(n-1)\,\phi(a)\,\phi(b)\,[\Phi(b) - \Phi(a)]^{n-2} \tag{2·1}$$

(cf. e.g. Cramér, 1945, p. 370). Let $u = \tfrac12(a+b)$ denote the arithmetical mean of the extreme
values of the sample. Making in (2·1) the transformation $a = u - \tfrac12 w$, $b = u + \tfrac12 w$ and
integrating with respect to u, we find for the frequency function of the range the expression

$$f_n(w) = n(n-1)\int_{-\infty}^{\infty} \phi(u - \tfrac12 w)\,\phi(u + \tfrac12 w)\,[\Phi(u + \tfrac12 w) - \Phi(u - \tfrac12 w)]^{n-2}\,du. \tag{2·2}$$

The object of our inquiry is the limiting form of the distribution (2·2). It proves, however,
more advantageous to pass to the limit in the joint distribution of a, b or u, w, before
integrating with respect to u.

The asymptotical distribution of a and b has been investigated by Fisher & Tippett (1928),
and Gumbel (1936) (cf. also Cramér, 1945, p. 376). According to these authors, we have

$$E(u) = 0, \qquad D(u) = O(\log^{-1/2} n),$$
$$E(w) = 2\sqrt{2\log n} + O\!\left(\frac{\log\log n}{\sqrt{\log n}}\right), \qquad D(w) = O(\log^{-1/2} n). \tag{2·3}$$

From the formulae quoted it is seen that $u \to 0$, $w \to \infty$ in probability as $n \to \infty$. Our first
task must, consequently, be a transformation of the variables a, b (or u, w), depending on
n and intended to stabilize the probability mass, in order to provide a limiting distribution.
* Prof. H. Wold has kindly directed my attention to this problem. 


† $\Phi(x)$ denotes the distribution function and $\phi(x) = \Phi'(x)$ the frequency function of the normal
distribution with mean at x = 0 and unit standard deviation.










Following the example of the authors mentioned above, we should have to introduce the
new variables $a' = n\Phi(a)$, $b' = n\Phi(-b)$.

For our purpose it proves, however, advantageous to subject a' and b' to a new transformation,
independent of n, taking

$$x e^{y} = 2n\Phi(a) = 2n\Phi(-\tfrac12 w + u), \qquad x e^{-y} = 2n\Phi(-b) = 2n\Phi(-\tfrac12 w - u). \tag{2·4}$$

Conversely,

$$x = 2n\sqrt{\Phi(a)\,\Phi(-b)} = 2n\sqrt{\Phi(-\tfrac12 w + u)\,\Phi(-\tfrac12 w - u)}, \qquad y = \tfrac12 \log\frac{\Phi(a)}{\Phi(-b)} = \tfrac12 \log\frac{\Phi(-\tfrac12 w + u)}{\Phi(-\tfrac12 w - u)}. \tag{2·5}$$

As $a \le b$ and thus $\Phi(a) + \Phi(-b) \le 1$, it follows from (2·4) that x, y are subjected to the
restrictions

$$x \ge 0, \qquad x\cosh y \le n. \tag{2·6}$$

Performing the transformation, we find

$$\left|\frac{\partial(a, b)}{\partial(x, y)}\right| = \frac{x}{2n^2\,\phi(a)\,\phi(b)} \tag{2·7}$$

and thus, letting $f_n(x, y)$ denote the frequency function of x, y,

$$f_n(x, y) = \frac{n-1}{2n}\,x\left(1 - \frac{x\cosh y}{n}\right)^{n-2}. \tag{2·8}$$

This formula is valid in the region (2·6); outside of it, we have to put $f_n(x, y) = 0$.

The new variables x, y depend, of course, on u as well as w. It will, however, be shown later
that x, for large n, tends to coincide with the variable

$$x^* = 2n\Phi(-\tfrac12 w),$$

which depends exclusively on w. For testing purposes, the former variable may thus, in
large samples, be used as a substitute for the range. These considerations justify the
transformation (2·4) as well as a closer study of the distribution of x and its limiting form.
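The transformation of this section can be tried out numerically. The following is a minimal sketch, not part of the original paper: the function name `transformed_variables` and the sample size are illustrative assumptions, and the standard library's `NormalDist` stands in for $\Phi$. It computes x, y of (2·5) and the variable x* for one random sample.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def transformed_variables(sample):
    """Return (x, y, x_star) for one sample, following (2.5) and x* = 2n*Phi(-w/2)."""
    n = len(sample)
    a, b = min(sample), max(sample)
    pa = PHI(a)    # Phi(a)
    pb = PHI(-b)   # Phi(-b) = 1 - Phi(b)
    x = 2 * n * math.sqrt(pa * pb)
    y = 0.5 * math.log(pa / pb)
    x_star = 2 * n * PHI(-(b - a) / 2)
    return x, y, x_star


# Illustrative sample of n = 50 standard normal observations (fixed seed).
rng = random.Random(1)
sample = [rng.gauss(0.0, 1.0) for _ in range(50)]
x, y, x_star = transformed_variables(sample)
print(x, y, x_star)
```

For any sample the values should respect the restrictions (2·6), and x should not exceed x*, in line with the inequality proved in section 5.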
3. Limit passage and remainder term. The limiting form of the joint frequency function
(2·8) is immediately seen to be

$$f(x, y) = \tfrac12\,x\,e^{-x\cosh y} \qquad (x \ge 0). \tag{3·1}$$

The integral of this function, taken over the whole half-plane $x \ge 0$, is easily seen to equal 1;
(3·1) is, consequently, the frequency function of a well-determined two-dimensional
distribution.

Let the marginal distribution functions in x, corresponding to (2·8) and (3·1), be denoted
$F_n(x)$ and $F(x)$ respectively. Our next task will be to estimate the remainder
$|F_n(x) - F(x)|$, which is, obviously, at most equal to the integral

$$\Delta_n = \int_0^\infty\!\!\int_0^x 2\,|f_n(\xi, \eta) - f(\xi, \eta)|\,d\xi\,d\eta. \tag{3·2}$$

To begin with, we estimate the quotient $f_n/f$ upwards. By differentiation with respect
to the variable $z = x\cosh y$, this quotient is found to attain the maximum value

$$\left(1 - \frac1n\right)\left(1 - \frac2n\right)^{n-2} e^{2} = 1 + \frac1n + O\!\left(\frac1{n^2}\right)$$

for z = 2. We thus find, for example,

$$\frac{f_n}{f} \le 1 + \frac{3}{2n} \qquad (n \ge 5). \tag{3·3}$$
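The upper estimate of the quotient can be checked directly. The sketch below is an illustration only; it assumes that the quotient is $\frac{n-1}{n}(1 - z/n)^{n-2}e^{z}$ with $z = x\cosh y$, as follows from (2·8) and (3·1), and that (3·3) reads $f_n/f \le 1 + 3/(2n)$ for $n \ge 5$. The function names are hypothetical.

```python
import math


def ratio(n, z):
    """Quotient f_n/f as a function of z = x*cosh(y), for 0 <= z < n."""
    return (n - 1) / n * (1 - z / n) ** (n - 2) * math.exp(z)


def max_ratio(n):
    """Maximum of the quotient, attained at z = 2."""
    return (1 - 1 / n) * (1 - 2 / n) ** (n - 2) * math.exp(2)


# Compare the maximum with the assumed bound 1 + 3/(2n) for a few n.
for n in (4, 5, 10, 100):
    print(n, round(max_ratio(n), 5), round(1 + 3 / (2 * n), 5))
```

Under these assumptions the bound holds from n = 5 onwards and fails for n = 4, which agrees with the restriction attached to (3·3).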










For the further estimations, it proves necessary to divide the domain of integration in (3·2)
into an interior and an exterior part by means of a convenient abscissa $\eta = y$. In order to
secure the Maclaurin expansion of $\log\left(1 - \frac{\xi}{n}\cosh\eta\right)$ within the interior region, we have to
choose y so as to satisfy the inequality $\frac{x}{n}\cosh y \le k$ with an appropriate $k < 1$. Taking, for
simplicity, $k = 1 - \sqrt{\tfrac12}$ and observing that $\cosh y < e^{y}$, we see that the condition mentioned
is fulfilled if

$$e^{y} \le \frac{n}{x}\left(1 - \sqrt{\tfrac12}\right). \tag{3·4}$$

Now we may estimate $f_n/f$ downwards in the interior domain of integration. Expanding
$\log\left(1 - \frac{\xi}{n}\cosh\eta\right)$, we find

$$\log\frac{f_n}{f} = \log\left(1 - \frac1n\right) + \frac{2\xi}{n}\cosh\eta - \frac{n-2}{2n^2}\,\xi^2\cosh^2\eta\left(1 - \vartheta\,\frac{\xi\cosh\eta}{n}\right)^{-2} \qquad (0 < \vartheta < 1). \tag{3·5}$$

According to the determination of y, the remainder factor is seen to be $< 2$ for $\xi \le x$, $\eta \le y$.
For $n \ge 3$, we have $\log\left(1 - \frac1n\right) > -\frac{3}{2n}$. Omitting, further, the positive term in (3·5) and
replacing $n - 2$ by n, we find

$$\frac{f_n(\xi, \eta)}{f(\xi, \eta)} - 1 > \log\frac{f_n}{f} > -\frac{3 + 2\xi^2\cosh^2\eta}{2n};$$

hence, combining with (3·3),

$$\left|\frac{f_n}{f} - 1\right| < \frac{3 + 2\xi^2\cosh^2\eta}{2n} \qquad (\xi \le x,\ \eta \le y;\ n \ge 5). \tag{3·6}$$

In the exterior domain of integration, (3·3) directly yields

$$|f_n(\xi, \eta) - f(\xi, \eta)| < f(\xi, \eta) \qquad (\xi \le x,\ \eta \ge y). \tag{3·7}$$

We proceed to the estimation of the integral (3·2), denoting its interior and exterior part
by $I_1$ and $I_2$ respectively. For the former we have, according to (3·6), the inequality

$$I_1 = \int_0^y\!\!\int_0^x \left|\frac{f_n}{f} - 1\right| 2f\,d\xi\,d\eta < \frac{1}{2n}\int_0^y\!\!\int_0^x (3\xi + 2\xi^3\cosh^2\eta)\,e^{-\xi\cosh\eta}\,d\xi\,d\eta, \tag{3·8}$$

for the latter, according to (3·7),

$$I_2 = \int_y^\infty\!\!\int_0^x 2\,|f_n - f|\,d\xi\,d\eta < \int_y^\infty\!\!\int_0^x \xi\,e^{-\xi\cosh\eta}\,d\xi\,d\eta. \tag{3·9}$$

The integration with respect to $\xi$ may be explicitly performed. We have, in fact, putting
for brevity $\cosh\eta = \alpha$,

$$\int_0^x \xi\,e^{-\alpha\xi}\,d\xi = \frac{1}{\alpha^2}\left\{1 - e^{-\alpha x}[1 + \alpha x]\right\}, \tag{3·10}$$

$$\int_0^x \xi^3 e^{-\alpha\xi}\,d\xi = \frac{6}{\alpha^4}\left\{1 - e^{-\alpha x}\left[1 + \alpha x + \frac{\alpha^2x^2}{2} + \frac{\alpha^3x^3}{6}\right]\right\}. \tag{3·11}$$

In order to deduce remainder formulas for (a) moderate, (b) small x, we omit in (3·10) and
(3·11), (a) all the negative terms, (b) the terms with $x^2$ and $x^3$. According to the Maclaurin
expansion

$$e^{\alpha x} = 1 + \alpha x + \tfrac12\alpha^2x^2e^{\vartheta\alpha x} \qquad (0 < \vartheta < 1),$$

the expression in curled brackets in (3·10) is at most equal to $\tfrac12\alpha^2x^2$. Inserting these
estimations in (3·8), we obtain for the interior integral the inequalities

$$I_1 < \frac{15}{2n}\int_0^y \frac{d\eta}{\cosh^2\eta} < \frac{15}{2n}\int_0^\infty \frac{d\eta}{\cosh^2\eta} = \frac{15}{2n}, \tag{3·12a}$$

$$I_1 < \frac{15x^2}{4n}\int_0^y d\eta < \frac{4x^2}{n}\,y. \tag{3·12b}$$

For the exterior integral, (3·10) yields

$$I_2 < \int_y^\infty \frac{d\eta}{\cosh^2\eta} < 2e^{-2y}. \tag{3·13}$$

Finally, we have to join the results (3·12) and (3·13). Combining, first, (3·12a) with (3·13)
and determining $e^{-y}$ from (3·4) (taken with the equality sign), we obtain, after some slight
simplifications in the numerical coefficients,

$$\Delta_n < \frac8n\left(1 + \frac{3x^2}{n}\right) \qquad (n \ge 5). \tag{3·14a}$$

Combining, on the other hand, (3·12b) with (3·13), we find

$$\Delta_n < \frac{4x^2}{n}\,y + 2e^{-2y}.$$

This expression attains, for fixed x and n, its minimum when $y = \log\frac{\sqrt n}{x}$. For $n \ge 12$, this
value of y also satisfies (3·4), and we obtain, as a parallel estimate to (3·14a),

$$\Delta_n < \frac{4x^2}{n}\left(\log\frac{\sqrt n}{x} + \frac12\right) \qquad (n \ge 12). \tag{3·14b}$$

The formulas (3·14a, b) are both valid for all positive x and all $n \ge 12$.
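The two remainder estimates are easily evaluated. The sketch below is illustrative only and rests on assumptions: it takes (3·14a) to read $(8/n)(1 + 3x^2/n)$, (3·14b) to read $(4x^2/n)(\log(\sqrt n/x) + \tfrac12)$, and the intermediate expression to be $(4x^2/n)\,y + 2e^{-2y}$. All function names are hypothetical.

```python
import math

K = 1 - math.sqrt(0.5)  # the constant k chosen in the text


def bound_a(x, n):
    """Estimate (3.14a), assumed form (8/n)(1 + 3x^2/n), for n >= 5."""
    return 8 / n * (1 + 3 * x * x / n)


def bound_b(x, n):
    """Estimate (3.14b), assumed form (4x^2/n)(log(sqrt(n)/x) + 1/2), for n >= 12."""
    return 4 * x * x / n * (math.log(math.sqrt(n) / x) + 0.5)


def combined(x, n, y):
    """Intermediate expression (4x^2/n)*y + 2*exp(-2y) before minimising over y."""
    return 4 * x * x / n * y + 2 * math.exp(-2 * y)


def satisfies_34(x, n):
    """Check condition (3.4), exp(y) <= (n/x)*k, at the minimising y = log(sqrt(n)/x)."""
    y0 = math.log(math.sqrt(n) / x)
    return math.exp(y0) <= n / x * K


for x in (0.1, 1.0, 4.0):
    print(x, bound_a(x, 20), bound_b(x, 20))
```

With these forms the minimising y satisfies (3·4) exactly when $\sqrt n \ge 1/k$, that is from n = 12 onwards, and (3·14b) is much the sharper estimate for small x.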


4. The asymptotical distribution. Having established the limiting distribution of the
variable x defined in (2·5), we are going to examine its properties.

The frequency function of the distribution considered reads, according to (3·1),

$$f(x) = x\int_0^\infty e^{-x\cosh y}\,dy = x\int_1^\infty \frac{e^{-xt}}{\sqrt{t^2-1}}\,dt. \tag{4·1}$$

Changing the order of integration, we easily find the distribution function, the mean and
the variance of (4·1) to be

$$F(x) = 1 - \int_0^\infty \frac{1 + x\cosh y}{\cosh^2 y}\,e^{-x\cosh y}\,dy = 1 - \int_1^\infty \frac{1 + xt}{t^2\sqrt{t^2-1}}\,e^{-xt}\,dt, \tag{4·1'}$$

$$E(x) = \tfrac12\pi, \qquad D^2(x) = 4 - \tfrac14\pi^2. \tag{4·2}$$

The numerical evaluation of the distribution is much simplified by the fact that f(x) as
well as F(x) is closely connected with certain Bessel functions. Denote

$$\varphi(x) = \int_0^\infty e^{-x\cosh y}\,dy = \int_1^\infty \frac{e^{-xt}}{\sqrt{t^2-1}}\,dt. \tag{4·3}$$

By differentiation and partial integration, this function is found to satisfy the differential
equation

$$\varphi''(x) + \frac1x\,\varphi'(x) - \varphi(x) = 0. \tag{4·4}$$








Changing x into $-ix$, we obtain for the function $\psi(x) = \varphi(-ix)$ the equation

$$\psi''(x) + \frac1x\,\psi'(x) + \psi(x) = 0; \tag{4·4'}$$

hence, $\psi(x)$ is a Bessel function of order zero.

In order to specify this function, we will deduce an asymptotical expression for the
function (4·3), valid for large x. For this purpose, we make in the latter integral (4·3) the
substitution $t = 1 + u/x$ and write

$$\left(1 + \frac{u}{2x}\right)^{-1/2} = 1 - \vartheta\,\frac{u}{4x} \qquad (0 < \vartheta < 1).$$

Performing the integration, we obtain

$$\varphi(x) = \sqrt{\frac{\pi}{2x}}\;e^{-x}\left[1 + O\!\left(\frac1x\right)\right] \qquad (x \to \infty), \tag{4·5}$$

which shows that the Bessel function $\psi(x) = \varphi(-ix)$ tends to zero for $x \to +i\infty$. This function
is, consequently, proportional to the Hankel function $H_0^{(1)}(x)$ (cf. Jahnke-Emde, 1909, p. 94).

Comparing the asymptotical expressions of $\varphi(x)$ and $iH_0^{(1)}(ix)$, we find the proportional
factor to be $\tfrac12\pi$, whence

$$f(x) = x\left[\tfrac12\pi\,iH_0^{(1)}(ix)\right]. \tag{4·6}$$

We proceed to the calculation of F(x). Every integral of $xH_0^{(1)}(x)$ is (cf. Jahnke-Emde,
p. 165) of the form $xH_1^{(1)}(x) + \text{Const.}$, where $H_1^{(1)}(x)$ is the first order Hankel function
corresponding to $H_0^{(1)}(x)$; consequently,

$$F(x) = \tfrac12\pi\,x\,H_1^{(1)}(ix) + C.$$

Now $\tfrac12\pi\,x\,H_1^{(1)}(ix)$ tends to zero as $-(\tfrac12\pi x)^{1/2}e^{-x}$ for $x \to \infty$ (cf. Jahnke-Emde, 1909, p. 101);
hence C = 1 and

$$F(x) = 1 - x\left[-\tfrac12\pi\,H_1^{(1)}(ix)\right]. \tag{4·7}$$

For small x, F(x) has the expansion

$$F(x) = \left(\log\frac{2}{\gamma x} + \frac12\right)\frac{x^2}{2} + \left(\log\frac{2}{\gamma x} + \frac54\right)\frac{x^4}{16} + \dots, \tag{4·8}$$

where

$$\log\frac{2}{\gamma} = 0·11593\dots. \tag{4·9}$$

The factors of x in (4·6) and (4·7) are tabulated in Jahnke-Emde (1909, pp. 135-6).
Below, we give a short table of f(x) and F(x). The corresponding curves are seen in Fig. 1.
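The limiting frequency and distribution functions may also be evaluated without Bessel tables, by direct quadrature of the integrals over y. The following is a minimal sketch under stated assumptions: it uses the integral forms $f(x) = x\int_0^\infty e^{-x\cosh y}\,dy$ and $F(x) = 1 - \int_0^\infty (1 + x\cosh y)\cosh^{-2}y\;e^{-x\cosh y}\,dy$, a plain trapezoid rule with an arbitrarily chosen step and cut-off, and the small-x series with $\log(2/\gamma) = \log 2 - C$, C being Euler's constant. The function names are hypothetical.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant, so that log(2/gamma) = log 2 - C


def _trapezoid(g, upper=20.0, h=0.005):
    """Trapezoid rule on [0, upper]; adequate because g decays extremely fast."""
    steps = int(round(upper / h))
    total = 0.5 * (g(0.0) + g(upper))
    for k in range(1, steps):
        total += g(k * h)
    return total * h


def f_limit(x):
    """Limiting frequency function: x * integral of exp(-x cosh y) over y >= 0."""
    return x * _trapezoid(lambda y: math.exp(-x * math.cosh(y)))


def F_limit(x):
    """Limiting distribution function, from the integral over y."""
    def g(y):
        c = math.cosh(y)
        return (1 + x * c) / (c * c) * math.exp(-x * c)
    return 1 - _trapezoid(g)


def F_small_x(x):
    """First two terms of the small-x expansion of F."""
    log_term = math.log(2 / x) - EULER_C
    return (log_term + 0.5) * x ** 2 / 2 + (log_term + 1.25) * x ** 4 / 16


for x in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(x, round(f_limit(x), 4), round(F_limit(x), 4))
```

The printed values can be compared with the short table of f(x) and F(x) given in the paper; the series is only intended for small x.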


5. Connexion between the variable x and the range. We now turn back to the original
object of our inquiry: the asymptotical distribution of the range.

Consider the variable

$$x = 2n\sqrt{\Phi(-\tfrac12 w + u)\,\Phi(-\tfrac12 w - u)} \tag{5·1}$$

introduced in (2·4). As mentioned earlier,

$$w \to \infty, \quad u \to 0 \ \text{in probability} \quad (n \to \infty). \tag{5·2}$$

Under such circumstances, for large n, x may be expected to behave substantially as the
variable

$$x^* = 2n\Phi(-\tfrac12 w), \tag{5·3}$$

which depends exclusively on the range.




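We shall now prove that $x^*/x \to 1$ in probability as $n \to \infty$. According to the well-known
asymptotic formula

$$\Phi(-z) = \frac{\phi(z)}{z}\left(1 - \frac{\vartheta}{z^2}\right) \qquad (z > 0;\ 0 < \vartheta < 1),$$

we may, for $|u| < \tfrac12 w$, write

$$\frac{x^*}{x} = e^{\frac12 u^2}\left(1 - \frac{4u^2}{w^2}\right)^{1/2}\left\{1 + O\!\left[(\tfrac12 w - |u|)^{-2}\right]\right\}.$$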
   x      f(x)     F(x)          x      f(x)     F(x)
  0·0    0·0000   0·0000        1·5    0·3207   0·5839
  0·1    0·2427   0·0146        2·0    0·2278   0·7202
  0·2    0·3505   0·0448        2·5    0·1559   0·8153
  0·3    0·4118   0·0832        3·0    0·1042   0·8795
  0·4    0·4458   0·1262        4·0    0·0446   0·9501
  0·5    0·4622   0·1718        5·0    0·0185   0·9798
  0·6    0·4665   0·2183        6·0    0·0075   0·9919
  0·7    0·4624   0·2648        7·0    0·0030   0·9968
  0·8    0·4522   0·3106        8·0    0·0012   0·9988
  0·9    0·4380   0·3552        9·0    0·0005   0·9995
  1·0    0·4210   0·3981       10·0    0·0002   0·9998
[Fig. 1. Curves of f(x) and F(x), plotted for x from 0 to 5.]


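Given an arbitrary $\epsilon > 0$, we obviously may find two positive numbers $u_\epsilon$ and $w_\epsilon$ ($> 2u_\epsilon$)
such that

$$\left|\frac{x^*}{x} - 1\right| < \epsilon \quad \text{if} \quad w \ge w_\epsilon,\ |u| \le u_\epsilon. \tag{5·4}$$

On account of (5·2), we may, on the other hand, choose $n_\epsilon$ so that the probability of the
simultaneous validity of the latter inequalities in (5·4) exceeds $1 - \epsilon$ if $n \ge n_\epsilon$. Consequently,

$$P\left(\left|\frac{x^*}{x} - 1\right| < \epsilon\right) > 1 - \epsilon \qquad (n \ge n_\epsilon), \tag{5·5}$$

which proves our statement.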
















As shown in section 3, the distribution function $F_n(x)$ of x converges to F(x) as $n \to \infty$.
Since F(0) = 0, it follows from (5·5), by a well-known method of argument, that the
distribution function $F_n^*(x)$ of x* converges to the same limiting function. The asymptotical
distribution of the range, suitably transformed, is hereby established.

For practical purposes, it would, of course, be desirable to possess a reasonably accurate
estimate of the remainder $F_n^*(x) - F(x)$, or at least an estimate of the difference $F_n^*(x) - F_n(x)$,
to be combined with the results (3·14).

For n = 20, the accuracy of F(x) as substitute for $F_n^*(x)$ may be checked by means of
Hartley's (1942) tables. The discrepancy amounts to about 0·004 for x = 0·1, 0·025 for x = 1
and 0·010 for x = 4.
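The size of the discrepancy at n = 20 can be illustrated by random sampling in place of tables. The sketch below is an illustration only, not the method of the paper: the number of trials, the seed and the function names are arbitrary assumptions, and the limiting values F(1) = 0·3981 and F(4) = 0·9501 are simply copied from the table of section 4.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def x_star(sample):
    """Transformed range x* = 2n * Phi(-w/2), with w the sample range."""
    n = len(sample)
    w = max(sample) - min(sample)
    return 2 * n * PHI(-w / 2)


def empirical_cdf_of_x_star(n, x, trials=4000, seed=0):
    """Monte Carlo estimate of the distribution function of x* at the point x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if x_star(sample) <= x:
            hits += 1
    return hits / trials


F_TABLE = {1.0: 0.3981, 4.0: 0.9501}  # limiting F(x), copied from the paper's table

for point, limit in F_TABLE.items():
    estimate = empirical_cdf_of_x_star(20, point)
    print(point, estimate, limit, round(estimate - limit, 4))
```

With a few thousand trials the sampling error is of the same order as the discrepancies quoted, so the output is only a rough check; many more trials would be needed to reproduce the second decimal reliably.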

The theoretical evaluation of $F_n^*(x) - F(x)$ seems to be somewhat complicated and, besides,
of little use since x*, for most purposes, may be replaced by x. A few remarks concerning
the relations between x, x* and their distribution functions will, however, be added below.

To begin with, we note that always $x \le x^*$, the equality sign being valid only if u = 0.
Consider, in fact, the function x(u), defined by (5·1) for a fixed w. Inserting for $\Phi$ its analytical
expression, we easily find that $D_u^2 \log x(u) < 0$ for all u. Hence, x(u) has no minimum and
at most one maximum, and the latter is, by symmetry, seen to be attained for u = 0, being
thus equal to x*.

From $x \le x^*$, it follows that $F_n^*(x) \le F_n(x)$ for all x. We will show that the difference
$F_n(x) - F_n^*(x)$ may be expressed as a double integral.

The variables u and w are, according to (2·4), well-determined functions of x and y in the
region (2·6); and so is the variable x*, on account of (5·3).

On the level curve $x^* = x_0$, w has a constant value $w_0$, determined by

$$2n\Phi(-\tfrac12 w_0) = x_0, \tag{5·6}$$

and this curve is, consequently, given in parametric form by the equations

$$x = 2n\sqrt{\Phi(-\tfrac12 w_0 + u)\,\Phi(-\tfrac12 w_0 - u)}, \qquad y = \tfrac12\log\frac{\Phi(-\tfrac12 w_0 + u)}{\Phi(-\tfrac12 w_0 - u)}, \tag{5·7}$$

where u runs through all values from $-\infty$ to $+\infty$. The latter function (5·7) being, obviously,
monotonously increasing, we may imagine u eliminated, writing (5·7) in the form

$$x = \xi_n(x_0, y) \qquad (-\infty < y < \infty). \tag{5·7'}$$

From the proof of the inequality $x \le x^*$ given above, it follows that the function (5·7') has
a single maximum for y = 0. When $y \to \pm\infty$, the function obviously tends to zero.

The inequality $x^* \le x_0$ is fulfilled on the left side of the curve (5·7'), the inequality $x \le x_0$
on the left side of the straight line $x = x_0$. Let us for brevity denote the regions (cf. Fig. 2)

$$0 \le x \le \xi_n(x_0, y), \qquad \xi_n(x_0, y) < x \le x_0 \tag{5·8}$$

by $A_n(x_0)$ and $B_n(x_0)$ respectively. The difference $F_n(x_0) - F_n^*(x_0)$ is, then, the probability
of the point (x, y) falling within the region $B_n(x_0)$. Dropping the indices 0, we thus obtain
the expression sought for

$$F_n(x) - F_n^*(x) = \iint_{B_n(x)} f_n(\xi, \eta)\,d\xi\,d\eta. \tag{5·9}$$

[Fig. 2. The regions $A_n(x_0)$ and $B_n(x_0)$ in the (x, y) plane, separated by the curve $x = \xi_n(x_0, y)$, with $B_n(x_0)$ bounded on the right by the line $x = x_0$.]













Comparing, finally, the transformed range distribution function $F_n^*(x)$ directly with its
limiting form F(x), we find

$$F_n^*(x) - F(x) = [F_n(x) - F(x)] - [F_n(x) - F_n^*(x)]$$
$$= \iint_{A_n(x)+B_n(x)} (f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f_n\,d\xi\,d\eta$$
$$= \iint_{A_n(x)} (f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f\,d\xi\,d\eta. \tag{5·10}$$

The former integral is, obviously, at most equal to the remainder expression $\Delta_n$ in (3·2),
estimated in (3·14).


6. Conclusion. Our main results may be summarized in the following theorem:

THEOREM. Consider a sample of n observations from an infinite normal population with
mean 0 and standard deviation 1. Let a be the smallest, b the greatest of the observed values,
and put

$$x = 2n\sqrt{\Phi(a)\,\Phi(-b)}, \qquad x^* = 2n\Phi\!\left(-\frac{b-a}{2}\right),$$

the latter variable being evidently a simple transformation of the range of the sample. Then

(1) $x \le x^*$; $x^*/x \to 1$ in probability $(n \to \infty)$.

(2) The distribution functions $F_n(x)$ and $F_n^*(x)$ of x and x* tend, for $n \to \infty$, to the
common limit

$$F(x) = 1 - \int_1^\infty \frac{1 + xt}{t^2\sqrt{t^2-1}}\,e^{-xt}\,dt = 1 + \frac{\pi}{2}\,x\,H_1^{(1)}(ix),$$

where $H_1^{(1)}(z)$ is the first order Hankel function, which vanishes for $z \to +i\infty$ (so that
$\tfrac12\pi\,x\,H_1^{(1)}(ix)$ behaves as $-(\tfrac12\pi x)^{1/2}e^{-x}$ for large x).

(3) For $n \ge 12$, $F_n(x)$ satisfies the inequalities

$$|F_n(x) - F(x)| < \frac8n\left(1 + \frac{3x^2}{n}\right), \qquad |F_n(x) - F(x)| < \frac{4x^2}{n}\left(\log\frac{\sqrt n}{x} + \frac12\right).$$


7. Generalization. A great part of our conclusions does not presuppose the normality
of the parental population. Thus, the distribution (2·8) of the variables x, y defined by (2·5)
is the same for any continuous probability law and so, consequently, is its limiting form;
however, if the parental distribution is non-symmetrical, with distribution function G(x),
say, the factor $\Phi(-b)$ in (2·5) must, of course, be replaced by $1 - G(b)$ instead of $G(-b)$,
and the variable x* is to be defined by

$$x^* = 2n\sqrt{G(-\tfrac12 w)\,[1 - G(\tfrac12 w)]}.$$

The proof of the statement $x^*/x \to 1$ requires, however, convenient assumptions
concerning the parental distribution. It can be proved that the assertion mentioned (and,
consequently, the theorem stated above) is valid if the frequency function of this
distribution is of the form

$$g(x) = C\exp\left[-\frac1p\,|x|^p\right],$$

where $1 < p \le 2$.







REFERENCES 


Cramér, H. (1945). Mathematical Methods of Statistics. Uppsala.

Fisher, R. A. & Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest
or smallest member of a sample. Proc. Camb. Phil. Soc. 24, 180.

Gumbel, E. J. (1936). Les valeurs extrêmes des distributions statistiques. Ann. Inst. Poincaré, 5,
115-58.

Hartley, H. O. (1942). The range in random samples. Biometrika, 32, 334-48.

Hartley, H. O. & Pearson, E. S. (1942). The probability integral of the range in samples of n observations
from a normal population. Biometrika, 32, 301-10.

Jahnke, E. & Emde, F. (1909). Funktionentafeln. Leipzig and Berlin.

McKay, A. T. & Pearson, E. S. (1933). A note on the distribution of range in samples of n. Biometrika,
25, 415-20.

Pearson, E. S. (1926). A further note on the distribution of range in samples, taken from a normal
population. Biometrika, 18, 173-94.

Pearson, E. S. (1932). The percentage limits for the distribution of range in samples from a normal
population. Biometrika, 24, 404-17.

Tippett, L. H. C. (1925). On the extreme individuals and the range of samples taken from a normal
population. Biometrika, 17, 364-87.










LIMITS OF THE RATIO OF MEAN RANGE TO 
STANDARD DEVIATION * 


By R. L. PLACKETT, B.A. 


The ratio of mean range $\bar w_n$ in samples of n to population standard deviation $\sigma$, which
has been denoted by $d_n$, is used in control chart work (when the population is assumed
normal) to estimate $\sigma$ from the ranges of a set of small samples. On comparing the series of
values of $d_n$ for different n when the parent population is rectangular with the series when
it is normal (see table below), it is clear that for n < 12 the two series agree to within less
than 10%. With this in mind, the question arises: what are the limiting values of $d_n$ for
a given n? It is shown here that populations exist for which $d_n$ is arbitrarily near to zero,
while for no population will $d_n$ exceed the value

$$n\sqrt{\frac{2\{(2n-2)! - [(n-1)!]^2\}}{(2n-1)!}}.$$

We consider a population whose distribution function is F(x) and which extends from
$-a$ to $+a$ so that $F(-a) = 0$ and $F(a) = 1$. The population in the first place may have
any finite limits, but there is no loss in generality in supposing these. It is required to find
limits to the ratio

$$d_n = \frac{\int_{-a}^{a}[1 - F^n - (1-F)^n]\,dx}{\left[\int_{-a}^{a}x^2F'\,dx - \left(\int_{-a}^{a}xF'\,dx\right)^2\right]^{1/2}}. \tag{1}$$

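We apply the calculus of variations and find the extremes of $d_n$ in the class of functions F
such that $F(-a) = 0$ and $F(a) = 1$; the case is thus one of fixed end-points. Suppose that
F(x) = u(x) gives an extreme value and form the functions F(x) = u(x) + tv(x); for t suitably
near to zero, all these will be permissible distribution functions, i.e. monotonically increasing,
provided $v(-a) = v(a) = 0$. Then for t = 0, $d/dt\,(d_n)$ is zero for all functions v(x).

$$\frac{d}{dt}\left\{\frac{\int_{-a}^{a}[1 - (u+tv)^n - (1-u-tv)^n]\,dx}{\left[\int_{-a}^{a}x^2(u'+tv')\,dx - \left(\int_{-a}^{a}x(u'+tv')\,dx\right)^2\right]^{1/2}}\right\}_{t=0}$$

$$= \frac{2n\left[\int x^2u'\,dx - \left(\int xu'\,dx\right)^2\right]\left[\int(1-u)^{n-1}v\,dx - \int u^{n-1}v\,dx\right] - \left[\int(1 - u^n - (1-u)^n)\,dx\right]\left[\int x^2v'\,dx - 2\int xu'\,dx\int xv'\,dx\right]}{2\left[\int x^2u'\,dx - \left(\int xu'\,dx\right)^2\right]^{3/2}},$$

all integrals being taken from $-a$ to $a$.

Now $\int_{-a}^{a}x^2v'\,dx = -2\int_{-a}^{a}xv\,dx$ since $v(a) = v(-a) = 0$, and by the same condition

$$\int_{-a}^{a}xv'\,dx = -\int_{-a}^{a}v\,dx.$$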


* Communication from the National Physical Laboratory. 




The numerator now becomes of the form $\int_{-a}^{a}s(x)\,v(x)\,dx$, and this must be zero for all
functions v(x); it is therefore concluded that s(x) is identically equal to zero. In fact

$$n\left[\int_{-a}^{a}x^2u'\,dx - \left(\int_{-a}^{a}xu'\,dx\right)^2\right]\left[(1-u)^{n-1} - u^{n-1}\right] - \left[\int_{-a}^{a}(1 - u^n - (1-u)^n)\,dx\right]\left[\int_{-a}^{a}xu'\,dx - x\right] = 0,$$

so that if $\mu$ is the mean, $\sigma$ the standard deviation, $\bar w_n$ the mean range in samples of n, and
F(x) the distribution function of the population which gives an extreme value to $d_n$, we have

$$\bar w_n(x - \mu) = n\sigma^2\left[F^{n-1} - (1-F)^{n-1}\right].$$

Put $x = -a$ and obtain $n\sigma^2 = \bar w_n(\mu + a)$; $x = a$ gives $n\sigma^2 = \bar w_n(a - \mu)$, whence $\mu = 0$ and

$$x = a\left[F^{n-1} - (1-F)^{n-1}\right]. \tag{2}$$


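This distribution must give an upper limit to $d_n$ since if we consider a distribution of the
type below:

[Diagram: a distribution on $(-\tfrac12, \tfrac12)$ with area $(1 - 2y)$ concentrated at x = 0 and area y spread uniformly over each of the intervals $(-\tfrac12, 0)$ and $(0, \tfrac12)$, where $y < \tfrac12$.]

$$F(x) = (2x+1)\,y \quad (-\tfrac12 < x < 0), \qquad F(x) = 1 - (1-2x)\,y \quad (0 < x < \tfrac12),$$

the ratio (1) for $y = O(n^{-2})$ is approximately $n\sqrt{3/2}\,\sqrt{y}$, which can be made as small as we
please.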
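The behaviour of the ratio for this type of distribution can be verified by a short calculation. The sketch below is illustrative and rests on an assumption about the diagram: the distribution is taken to have a concentration of mass $1 - 2y$ at x = 0 together with a uniform density 2y on $(-\tfrac12, \tfrac12)$, which gives variance y/6 and a mean range obtainable in closed form. The function name `ratio_spike` is hypothetical.

```python
import math


def ratio_spike(n, y):
    """Ratio of mean range to standard deviation for the assumed mixture.

    Mean range = (1/y) * integral_0^y [1 - (1-s)^n - s^n] ds, after the
    substitution s = (1 - 2x) * y in the integral of 1 - F^n - (1-F)^n.
    """
    # 1 - (1-y)^(n+1), computed stably for small y
    one_minus_pow = -math.expm1((n + 1) * math.log1p(-y))
    mean_range = (y - one_minus_pow / (n + 1) - y ** (n + 1) / (n + 1)) / y
    sigma = math.sqrt(y / 6)
    return mean_range / sigma


n = 5
for y in (0.5, 1e-2, 1e-4, 1e-6):
    print(y, ratio_spike(n, y), n * math.sqrt(1.5 * y))
```

For $y = \tfrac12$ the concentration at the origin disappears and the value reduces to that of the rectangular distribution; as y decreases the ratio approaches $n\sqrt{3y/2}$ and tends to zero.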


Reverting therefore to equation (2) we note that since $a\bar w_n = n\sigma^2$, $d_n(\text{max.}) = n\sigma/a$.

$$\sigma^2 = \int_{-a}^{a}x^2\,dF - \left(\int_{-a}^{a}x\,dF\right)^2 = a^2\int_0^1\left[F^{n-1} - (1-F)^{n-1}\right]^2 dF.$$

Therefore

$$\sigma^2/a^2 = \frac{2}{2n-1} - 2B(n, n),$$

i.e.

$$d_n(\text{max.}) = n\sqrt{\frac{2\{(2n-2)! - [(n-1)!]^2\}}{(2n-1)!}}. \tag{3}$$

It is of interest to note that all the foregoing analysis may be carried out with a equal to
any finite value and so we may take the limit as $a \to \infty$, and equation (3), which is independent
of a, will still hold.

It is easy to verify, by Stirling's formula or otherwise, that as n increases $[(n-1)!]^2$
becomes negligible compared with $(2n-2)!$.

Consequently, for large n,

$$d_n(\text{max.}) \approx n\sqrt{\frac{2}{2n-1}} = \sqrt{n + \frac12 + \frac{1}{4n-2}},$$

i.e.

$$d_n(\text{max.}) \approx \sqrt{n + \tfrac12}. \tag{4}$$

The probability density function of (2) is obtained by differentiation and is

$$f(x) = \frac{1}{a(n-1)\left[F^{n-2} + (1-F)^{n-2}\right]}, \tag{5}$$
















so that (2) and (5) are the parametric equations of the curve in terms of its distribution
function. Thus for n > 2, $f(0) = \dfrac{2^{n-3}}{a(n-1)}$, $f(\pm a) = \dfrac{1}{a(n-1)}$. The distributions (2) are readily
seen to be unimodal and symmetrical about x = 0. For n = 2, 3 they are rectangular. For
$F > \tfrac12$ and large n, $aF^{n-1} \approx x$. Hence $F^{n-2} \approx x/a$ and $f(x) \approx \dfrac{1}{x(n-1)}$. Similar considerations for
$F < \tfrac12$ show that for large n and $x \ne 0$,

$$f(x) \approx \frac{1}{|x|\,(n-1)}.$$

From (4), $\sigma \approx a/\sqrt n$. Consequently, for any finite a, as $n \to \infty$ the distributions (2) tend to
a single ordinate at x = 0. This should be compared with the limiting case giving $d_n \to 0$
for fixed n illustrated with the diagram above. The limiting form of the two distributions is
the same but the approach to the limit with increasing n is quite different. There is no
approach to normality.

Following is a table of $d_n(\text{max.})$ and of $d_n$ in samples from normal and rectangular
populations for n = 2, ..., 12. The quantity $\sqrt{n + \tfrac12}$ is also included to see how closely (4) is
approximated. The values of $d_n$ (normal) are obtained from the paper by E. S. Pearson
(1942). For a rectangular distribution $d_n$ is simply $2\sqrt3\,(n-1)/(n+1)$.


























  n    √(n + ½)    d_n (max.)    d_n (normal)    d_n (rectangular)
  2    1·58114     1·15470       1·128           1·15470
  3    1·87083     1·73205       1·693           1·73205
  4    2·12132     2·08395       2·059           2·07846
  5    2·34521     2·34013       2·326           2·30940
  6    2·54951     2·55333       2·534           2·47436
  7    2·73861     2·74414       2·704           2·59808
  8    2·91548     2·92076       2·847           2·69430
  9    3·08221     3·08685       2·970           2·77128
 10    3·24037     3·24440       3·078           2·83426
 11    3·39116     3·39466       3·173           2·88675
 12    3·53553     3·53860       3·258           2·93116
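The tabulated columns other than $d_n$ (normal) follow from closed formulas and can be recomputed. The sketch below is an illustration, assuming the upper limit (3) in the form $n\sqrt{2\{(2n-2)! - [(n-1)!]^2\}/(2n-1)!}$, the approximation (4) as $\sqrt{n + \tfrac12}$, and the rectangular value $2\sqrt3\,(n-1)/(n+1)$; the function names are hypothetical.

```python
import math


def d_max(n):
    """Upper limit of d_n, assumed form of equation (3)."""
    numerator = 2 * (math.factorial(2 * n - 2) - math.factorial(n - 1) ** 2)
    return n * math.sqrt(numerator / math.factorial(2 * n - 1))


def d_approx(n):
    """Large-n approximation (4): sqrt(n + 1/2)."""
    return math.sqrt(n + 0.5)


def d_rectangular(n):
    """d_n for a rectangular population: 2*sqrt(3)*(n-1)/(n+1)."""
    return 2 * math.sqrt(3) * (n - 1) / (n + 1)


for n in range(2, 13):
    print(n, f"{d_approx(n):.5f}", f"{d_max(n):.5f}", f"{d_rectangular(n):.5f}")
```

For n = 2 and n = 3 the upper limit coincides with the rectangular value, in agreement with the remark that the extremal distributions are rectangular in those cases.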

















Some values of $d_n$ for a number of symmetrical populations were given by Pearson
& Adyanthaya (1928) and have been reproduced with some figures for one skew population
in Tables for Statisticians and Biometricians, Part II, Table XXIII. The majority of these
values were obtained empirically from random sampling experiments. These values were
of course subject to sampling error and for this reason are in three cases very slightly
above $d_n(\text{max.})$.


Some of the preceding work was done as part of the Research and Development programme 
of the Ministry of Supply (S.R. 17) and appears by permission of the Chief Scientific Officer. 
It was completed as part of the research programme of the National Physical Laboratory, 
and this paper is published by permission of the Director of the Laboratory. 


REFERENCES 


Pearson, E. S. (1942). The probability integral of the range in samples of n observations from 
a normal population. Biometrika, 32, 301-10. 

Pearson, E. S. & Adyanthaya, N. K. (1928). The distribution of frequency constants in small samples
from symmetrical populations. Biometrika, 20 A, 356-60. 







SIGNIFICANCE TESTS FOR 2x2 TABLES 
By G. A. BARNARD, Imperial College 


Part I 


The theory of statistical significance tests deals with abstractions of experimental results. 
The fact that the figures dealt with may happen to be tensile strengths of iron bars, or 
perhaps weights of babies, is ignored in the carrying out of the test; and for the purpose of 
statistical theory the experiment in question could just as well be represented by an experiment
involving the drawing of balls from urns. In fact, it is an advantage, from some points
of view, to replace the concrete experiment involved in a particular practical case by an 
‘abstract’ urn-experiment, in order to retain in view only those features of the case which 
can be dealt with by statistical methods. 

It is obvious enough that the first step in the statistical treatment of an experimental 
result may be represented as the replacement of the concrete experiment by an ‘urn- 
experiment’; but the implications of this have not always had the continuous attention they 
deserve. Once the abstract picture has been formed, the analysis of it is largely a matter 
of pure mathematics. What distinguishes the statistician from the pure mathematician, in 
this connexion, should be the statistician’s ability to form valid abstract pictures of concrete 
cases, and his clear recognition of the limits of validity of his abstract pictures. Yet we find 
relatively little discussion in statistical text-books of the process of formation of these
abstract pictures. 

It is the purpose of the first part of this paper to draw attention to the confusion which 
may arise through the possible formation of several different abstract pictures, each of 
which may apply to some concrete cases, though not to others. 

Suppose we are given two mass-production processes, A and B, and we wish to test 
whether process A and process B are equally satisfactory, in the sense that neither process 
is more likely to produce defective items than the other. For this purpose we take, say, 
m articles made by process A, and n made by process B, and test them, under suitable con- 
ditions. We find that a out of the m articles are defective, while b out of the n articles are 
defective, a result which can be represented in the form of a 2 x 2 table (Table 1). 


Table 1

             I (defective)   II (non-defective)   Total
Process A          a                 c              m
Process B          b                 d              n
Total              r                 s              N




















The statistical analysis of results of this type has been much discussed, but it seems to 
have escaped notice that, on the facts incompletely stated as above, it is possible to form 
several different abstract pictures, any one of which might be appropriate to the real case 
in question. The adoption of one picture rather than another will depend, in a given case,
on further knowledge which is not specified above. 













The basis of Fisher’s ‘exact’ test 

The current generally accepted test for results of the above type is that given by Fisher 
(1941), or some approximation to it. The simplest abstract picture * to which this test corre- 
sponds would seem to be one in which the m articles made by process A and the n articles 
made by process B are represented by N similar balls, m of them marked A and n marked B. 
The N balls are put into an urn, and then withdrawn in random order. As they are withdrawn, 
the balls are placed, in order, in a row of N receptacles, r of which have been marked ‘I’, 
the remainder being marked ‘II’. The result of Table 1 then represents the observation that 


a of the balls marked A are in receptacles marked ‘I’. The probability of such a result, in
such an experiment, is

$$\frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!}, \qquad (1)$$

which can be seen by considering that the contents of the r receptacles marked ‘I’ form a 
sample of r from an urn containing m balls marked A and n balls marked B, the sampling 
being done without replacement. The probability (1), added to those of all results less 
probable than that obtained, is the basis of Fisher’s test. 
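
The calculation just described can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, expression (1) is rewritten in the equivalent form C(m,a) C(n,b) / C(N,r), and results exactly as probable as the one observed are counted with the less probable ones, which is an assumption about how ties are treated.

```python
from math import comb


def table_probability(a, b, m, n):
    """Expression (1): probability of the table (a, b) when the margins
    m, n and r = a + b are all fixed."""
    r = a + b
    return comb(m, a) * comb(n, b) / comb(m + n, r)


def exact_test_level(a, b, m, n):
    """Probability (1) of the observed table, added to the probabilities
    of all tables with the same margins that are no more probable."""
    r = a + b
    p_obs = table_probability(a, b, m, n)
    total = 0.0
    # x runs over every value of the top-left cell compatible with the margins.
    for x in range(max(0, r - n), min(m, r) + 1):
        p = table_probability(x, r - x, m, n)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for rounding in ties
            total += p
    return total


if __name__ == "__main__":
    print(exact_test_level(2, 0, 2, 2))
```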

In the concrete case given, the N balls, initially similar, may be taken to correspond with 
the N items of raw materials. The process of labelling the balls A and B corresponds to the 
selection of m of the items of raw material, and their fabrication into articles by process A, 
and the fabrication of the n remaining ones by process B. The N receptacles into which the 
balls are eventually placed then represent the N ‘test occasions’ which must be provided 
for when the experiment is laid out. The fact that these receptacles are labelled ‘I’ or ‘II’ 
before the balls are placed in them corresponds to the assumption of the hypothesis being 
tested—that the processes do not differ in respect of liability to defectives, so that whether 
or not a given article is defective has nothing to do with whether it is A or B. The labelling 
‘I’ or ‘II’ is thus assumed independent of the labelling of the balls. Finally, the random 
allocation of balls to receptacles corresponds to a precaution which might have been taken in 
the concrete case, viz. the random order of test of the article secured by the use of random 
numbers or the like. 

The basis of the C.S.M. test 


Another abstract picture, also applicable to the concrete case as incompletely described 
above, forms the basis of the test to be developed in the later part of this paper, which we 
have called the C.S.M. test. In this picture, the two processes A and B are represented by two
urns, A and B, each urn containing a large number of balls, some of which are marked ‘I’,
while the others are marked ‘II’. The selection for test of m articles of process A is represented
by the random drawing of m balls from urn A; and similarly for the n articles of process B.
The test procedure corresponds to the examination of the balls, to see whether they are
marked ‘I’ or ‘II’. The liability of process A to produce defectives is represented by the
proportion $p_1$ of balls marked ‘I’ in urn A, while $p_2$ similarly represents the liability of
process B. The hypothesis we wish to test says that $p_1 = p_2 = p$, say. The probability of a
result such as that of Table 1 is very nearly

$$\frac{m!}{a!\,c!}\,p_1^{a}(1-p_1)^{c} \times \frac{n!}{b!\,d!}\,p_2^{b}(1-p_2)^{d}, \qquad (2)$$


* Though not the only possible one. By following Fisher’s argument, as given in his book, one can 
construct a more complicated picture which leads to a similar result.
















which, on the hypothesis tested, becomes

$$\frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}. \qquad (3)$$

We may notice that the expression (3) differs from (1) by a factor

$$\frac{N!}{r!\,s!}\,p^{r}(1-p)^{s},$$

and it would have been obtained in the earlier case if we had assumed that the labelling of
the receptacles was itself done randomly, by selection of N labels from a box containing a
large number of labels, the proportion marked ‘I’ being p.

To justify the application of our second picture to a concrete case, we should have to be 
satisfied that the conditions of process A and those of process B were sufficiently stable, in 
a statistical sense, to justify the formation of the notions corresponding to $p_1$ and $p_2$. We
should further have to make sure that our selection of samples of m and n respectively was 
for practical purposes random. And finally, we should have to be reasonably sure that the 
conditions of test themselves had practically no influence on the results of the test—that the 
test used revealed a real property of the article tested, rather than a property of the in- 
dividual conditions of test. 
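
The relation between the first and second pictures can be checked numerically. The sketch below, with hypothetical function names, evaluates expression (1), expression (2) with a common p (which is expression (3)), and the factor N!/(r! s!) p^r (1-p)^s, and confirms that (3) equals (1) multiplied by that factor.

```python
from math import comb, isclose


def prob_first_picture(a, b, m, n):
    """Expression (1): all margins fixed."""
    return comb(m, a) * comb(n, b) / comb(m + n, a + b)


def prob_second_picture(a, b, m, n, p1, p2):
    """Expression (2): two independent binomial samples, one from each urn."""
    return (comb(m, a) * p1 ** a * (1 - p1) ** (m - a)
            * comb(n, b) * p2 ** b * (1 - p2) ** (n - b))


def connecting_factor(a, b, m, n, p):
    """The factor N!/(r! s!) p^r (1-p)^s by which (3) differs from (1)."""
    r, big_n = a + b, m + n
    return comb(big_n, r) * p ** r * (1 - p) ** (big_n - r)


if __name__ == "__main__":
    a, b, m, n, p = 2, 5, 8, 6, 0.3
    lhs = prob_second_picture(a, b, m, n, p, p)  # expression (3)
    rhs = prob_first_picture(a, b, m, n) * connecting_factor(a, b, m, n, p)
    print(lhs, rhs, isclose(lhs, rhs))
```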







Another type of abstract experiment 


Another case of common occurrence may be represented by a single urn, containing balls 
each of which carries two marks—one mark being either A or B, the other mark being 
either ‘I’ or ‘II’. The experiment consists in drawing N balls from the urn, at random, and 
examining their markings. If the proportion of balls marked ‘AI’ is $p_{a1}$, while $p_{b1}$, $p_{a2}$, $p_{b2}$
similarly represent the proportions of the other markings in the urn, the probability asso-
ciated with Table 1 in this case is

$$\frac{N!}{a!\,b!\,c!\,d!}\,p_{a1}^{a}\,p_{b1}^{b}\,p_{a2}^{c}\,p_{b2}^{d} \qquad (4)$$

by the multinomial theorem, provided the number of balls in the urn is large. In this case
the hypothesis tested, that the markings ‘I’ and ‘II’ on the one hand, and the markings A
and B on the other, are independent, may be put in the form

$$p_{a1}\,p_{b2} = p_{a2}\,p_{b1},$$

and, assuming that $(p_{a1}+p_{a2}) = p'$ and $(p_{b1}+p_{b2}) = 1-p'$, and $(p_{a1}+p_{b1}) = p$ and
$(p_{a2}+p_{b2}) = 1-p$, do not vanish, the probability of our result, on the hypothesis tested,
can be expressed as

$$\frac{N!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}\,(p')^{m}(1-p')^{n}, \qquad (5)$$

which differs from (3) by a factor

$$\frac{N!}{m!\,n!}\,(p')^{m}(1-p')^{n}.$$

This shows that (5) is related to (3) in much the same way as (3) is related to (1).
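
The passage from (4) to (5) may be verified numerically. In the sketch below (function names hypothetical), independence is imposed by writing each cell proportion as a product of marginal proportions, for instance p' p for the cell ‘AI’, and the multinomial probability (4) is compared with the factorized form (5).

```python
from math import factorial, isclose


def multinomial_coefficient(a, b, c, d):
    """N!/(a! b! c! d!) with N = a + b + c + d."""
    return factorial(a + b + c + d) // (
        factorial(a) * factorial(b) * factorial(c) * factorial(d))


def prob_expression_4(a, b, c, d, pa1, pb1, pa2, pb2):
    """Expression (4): multinomial probability of the four cell counts."""
    return (multinomial_coefficient(a, b, c, d)
            * pa1 ** a * pb1 ** b * pa2 ** c * pb2 ** d)


def prob_expression_5(a, b, c, d, p, p_prime):
    """Expression (5): the same probability when the markings are independent."""
    m, n, r, s = a + c, b + d, a + b, c + d
    return (multinomial_coefficient(a, b, c, d)
            * p ** r * (1 - p) ** s * p_prime ** m * (1 - p_prime) ** n)


if __name__ == "__main__":
    a, b, c, d = 2, 5, 6, 1
    p, p_prime = 0.4, 0.55
    # Cell proportions under independence of the two markings.
    cells = (p_prime * p, (1 - p_prime) * p,
             p_prime * (1 - p), (1 - p_prime) * (1 - p))
    print(isclose(prob_expression_4(a, b, c, d, *cells),
                  prob_expression_5(a, b, c, d, p, p_prime)))
```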

This situation could present itself in our concrete case if the articles made by the two 
processes A and B were mixed up together in a common store, and the test sample of N
were randomly drawn from this store; the subsequent conditions being as in the second case. 
Statisticians with industrial experience may perhaps feel it is unlikely that the experiment 










would be performed in this way; but it must be admitted that it could have been. Cases 
such as this seem to occur more frequently in biometric investigations, where a population 
of animals is being tested for the association or otherwise of two characters. 


Nomenclature 


The name ‘double dichotomy’ has been applied generally to all experiments leading to 
results of the form of Table 1, but the foregoing analysis would suggest that it might be more 
appropriate to restrict this term to the third case we have indicated. Since the second case 
can be obtained from the third by supposing the numbers of articles made by process A 
and by process B to be fixed, we might then call the second case the (singly) restricted double 
dichotomy. Similarly, the first case would be called the doubly restricted double dichotomy. 
Such a nomenclature, apart from a lack of euphony, would be open to the objection that it 
would tend to imply that the third case was the general one, the first two being derivatives 
of it. This, in turn, would imply that the subject-matter of our investigation in cases one and 
two was in reality a four-fold universe, the restrictions on numbers being merely matters of 
experimental technique. But such is not always the case. The question implied in our second 
case presupposes two two-fold populations, which are to be compared, and no four-fold 
super-population need exist for this question to have meaning. 

We therefore propose the names ‘double dichotomy’ for the third case, ‘2 x 2 comparative 
trial’ for the second case, and ‘2 x 2 independence trial’ for the first case, though here again 
an objection on aesthetic grounds would be easy to sustain. 


Finer distinctions 

In principle it could be maintained that there is a distinction between the 2 x 2 compara- 
tive trial, as instanced above, and a restricted double dichotomy. As we have said, the funda- 
mental subject-matter of a 2 x 2 comparative trial is a pair of populations; while the subject- 
matter of a restricted double dichotomy is a four-fold population from which we happen, by
an accident of experimental technique, to be able to extract samples in which the numbers 
of items having certain characteristics are fixed. The latter case could arise, for example, 
if an attempt was being made to discover association between colour of eyes in school- 
children and some less easily identified characteristic, such as membership of a particular 
blood-group. We could imagine that an experimenter might pick out m children with (say) 
blue eyes, and n without blue eyes, and then, having obtained his samples, he might subject 
them to a test for blood-group. The conclusions drawn from such an experiment would 
presumably be intended to apply to the population of school-children, a four-fold one relative 
to the two characteristics in question. The distinction between the two cases comes out if 
we consider what happens if, in the 2 x 2 comparative trial, all items tested turn out to be
defective. In this case we should say that our question, whether $p_1 = p_2$ or not, tends to be
answered in the affirmative. In the case of the school-children, if they all turn out to have
the same blood-group, then no conclusion on our question about the four-fold population
can be drawn at all.

Similar distinctions apply to the 2 x 2 independence trial. In the psycho-physical experi- 
ment described by Fisher (1942), where the point at issue is whether or not a lady can tell 
whether the milk or the tea has been put in the cup first, no statistical population is pre- 
supposed. The question would have meaning even if we refused to regard the order of in- 
sertion of milk or tea as ever being a matter of chance, while at the same time we regarded 





























































the lady’s guess as equally determinate. The ‘statistical population’ enters into this experi-
ment only in the experimental technique, via the randomization procedure used to fix the
order of presentation of cups; it does not enter into the question being asked. In this case,
the extreme result, in which in fact the milk was put in first every time, while the lady
guessed every time that it was otherwise, would be taken as evidence against the lady’s
claim. But such a result could by itself have no meaning for the question asked in the case
of a restricted 2 x 2 trial or a doubly restricted double dichotomy.

Further types of experimental procedure leading to results expressible in the form of
Table 1 are the various sequential procedures that have been described for deciding questions
of the kind we have been discussing (3, 4). Yet another procedure is one where the conditions
of trial vary from one block of tests to another, as when an open-air trial runs over several
days of inconstant weather. Here we might suppose there were k pairs of urns, $(A_1, B_1)$,
$(A_2, B_2)$, ..., $(A_k, B_k)$. The distinctions here are, however, obvious enough, and they are
worth noting only in order to emphasize that the mere fact that results are presented in the
form of Table 1 is not in itself sufficient to specify an appropriate test of significance.

Part II

The significance test for the 2 x 2 trial

Roughly speaking, the object of a significance test as applied to results of the type con-
sidered, is to answer the question: Can these results be ascribed to ‘chance’? In this form,
the question is not sufficiently precise. If our ‘urn model’ for the 2 x 2 comparative trial is
adequate to represent the experiment actually carried out, then the results will in any case
be ‘due to chance’, in some sense. What we wish to know in this case is whether a particular
kind of chance, namely, one in which $p_1 = p_2 = p$, can be said to account for our results.
If the results are such that this explanation of them is untenable, then we may conclude
either, that our particular ‘urn model’ of the experiment is inadequate anyway; or we may
retain the model, and conclude that $p_1$ and $p_2$ must be unequal. In most cases, of course,
we shall reach the latter conclusion, since we would not have made up the urn model in
question unless we had some reasons for believing in its adequacy; but it is well to bear in
mind the first alternative, in case a re-examination of the circumstances may make us change
our minds. A point very strongly emphasized by Fisher in his book The Design of Experi-
ments is, that we ought to have in mind a particular ‘urn model’ before the experiment is
performed, and arrange the conduct of the experiment so that the adequacy of this urn
model is not likely to be questioned afterwards.

With the qualifications indicated, we can say that the object of the significance test we
propose to develop is, to enable a particular class of explanations of our experimental results
to be ruled out as untenable. Specifically, given results like those of Table 1, we want to be
able to say that they could not be accounted for by supposing that the experiment we
actually performed was analogous to the urn experiment with two urns in which $p_1 = p_2 = p$.
This raises the question, in what sense could such a supposition fail to account for the
observed results? Any result of the form of Table 1 could arise in an experiment of this kind,
when our supposition is true. Why, then, should we select some results of this form and say
they are incompatible with our supposition?

In the last analysis, this question cannot be answered without an examination of what is
meant in general by statements involving probabilities, a point which is still the subject of










controversy. But in our particular case (if not in all cases) we can avoid giving a general 
answer to the question of what probability is, by considering the practical circumstances 
which form the setting for our particular problem, and the uses to which we propose to put 
the answer. In fact, in our case we are interested in the equality or otherwise of $p_1$ and $p_2$
because we want to decide which of the two processes, A and B, is to be preferred, from the
point of view of defectives produced. To say that $p_1$ is greater than $p_2$ will mean, for us, that
process B is preferable, and conversely if $p_2$ is greater than $p_1$, while to say that $p_1$ and $p_2$
are equal will mean that there is nothing to choose between the two processes. In fact, to
say that $p_1 = p_2$, in our case, means that, if process A and process B are both used, then it
will be found that the frequencies with which defectives appear in the two processes will,
for practical purposes, be equal.* Thus we shall assert that results in which the observed
frequencies, a/m and b/n, differ widely, are incompatible with the supposition that $p_1 = p_2$;
in doing so, we shall be neglecting as impossible a class of events which are in reality logically
possible, but whose probability is small. The precise formulation of a test of significance
then reduces to a precise formulation of what is meant by a ‘wide difference’ in the fre-
quencies a/m and b/n, and to an evaluation of the probability of those events which are being
neglected as impossible.


The lattice diagram 


If we consider the first problem, of arranging results like those of Table 1 in order of the 
relative ‘width’ of the differences they indicate, a first step is the enumeration of all possible 
results in a convenient form. 

Logically, we should begin by noting that Table 1 is really an abbreviated version of the
results of any one particular experiment, which will to start with be like those of Table 2 
(where we have taken m = 8, n = 6, for definiteness). 


Table 2 
Urn:   A   A  A   A   A  A   A   A   B  B  B   B  B  B
Mark:  II  I  II  II  I  II  II  II  I  I  II  I  I  I


But if, as we are presupposing, our urn analogy is adequate to represent the conditions of 
the experiment, the order in which the results were obtained must be irrelevant to the 
interpretation of results. If the conditions of trial varied during the course of the experiment, 
this assumption might not be correct—for example, if the trial were an open-air trial, and 
it began to rain half-way through. We are assuming that the urn analogy is adequate, and 
so we must treat all results like Table 2 which give the same values to a, b, c, d, in Table 1, 
as equivalent. Table 1 therefore stands for m! n!/a!b!c!d! distinct, but equivalent, results 
which we shall not distinguish from now on. 
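
The count of equivalent results can be confirmed by direct enumeration. The sketch below (function names hypothetical) evaluates m! n!/a! b! c! d! and compares it with a brute-force count over all sequences of marks for m = 8, n = 6. The values a = 2, b = 5 are read from the marks of Table 2 as printed above and should be treated as an assumption of the example.

```python
from itertools import product
from math import factorial


def equivalent_results(a, b, c, d):
    """Number of distinct ordered results sharing the cell counts a, b, c, d:
    m! n! / (a! b! c! d!) with m = a + c and n = b + d."""
    m, n = a + c, b + d
    return (factorial(m) * factorial(n)
            // (factorial(a) * factorial(b) * factorial(c) * factorial(d)))


def brute_force_count(a, b, m, n):
    """Count sequences of m marks for urn A followed by n marks for urn B
    that contain exactly a marks 'I' from urn A and b marks 'I' from urn B."""
    count = 0
    for marks in product(("I", "II"), repeat=m + n):
        if marks[:m].count("I") == a and marks[m:].count("I") == b:
            count += 1
    return count


if __name__ == "__main__":
    print(equivalent_results(2, 5, 6, 1), brute_force_count(2, 5, 8, 6))
```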

If we now take rectangular axes in a plane, we can represent Table 1 by the point whose
coordinates are (a,b). Thus ‘x’ in Fig. 1 represents the set of results equivalent, in the sense 
of the previous paragraph, to the results of Table 2. At the same time, all possible results 
of the experiment which gave rise to Table 2 are represented by the points of the rectangle 


* We hope that the qualifications we have attached to our statements will be sufficient to guard us 
against the accusation that we have adopted in full a ‘frequency theory’ of probability. The frequency 
interpretation is relevant to our particular problem ; other problems may involve other interpretations. 
More than one interpretation may be relevant in a single problem.



















PQRS. We call this representation of possible results the lattice diagram.* Our problem 


may now be regarded as one of ordering the points of the lattice diagram according to the
‘width’ of the difference they indicate. 


Conditions S and C 


In trying to make the idea of ‘width’ of difference precise, we are up against difficulties 
similar to those attaching to the interpretation of results on the basis of incomplete infor- 
mation about the circumstances of the experiment. The information given at first was 
compatible with several distinct ‘urn models’. Similarly, the information given now is 
compatible with several different notions about ‘width’ of difference. We may be concerned 
with the arithmetical size of the difference $p_1 - p_2$, or with the ratio $p_1/p_2$, or with the
logarithm of this ratio, or with some more complicated function.

Logically, therefore, we should expect to set up various tests, based on various ideas of
what constitutes ‘width’ of difference in probability or frequency (in Neyman and Pearson’s
language, corresponding to various weight functions over the space of alternatives to the
hypothesis tested). But here a factor which may simply be described as laziness enters in.
If we carried our ideas to their logical conclusion, we should find ourselves constructing a
new test for almost every new experiment we had to deal with; and the time and effort
involved in this are too great. Consequently, we confine our attempt to producing a test
which will be reasonably applicable to a wide class of cases of the type specified, without
suggesting that this test is unique, or ‘best possible’.

[Fig. 1. Lattice diagram: the rectangle PQRS of points (a, b), with the point ‘x’ representing the results of Table 2.]

Table 3

         I    II   Total
A        c    a    m
B        d    b    n
Total    s    r    N


First, then, in our ordering of points in the lattice diagram, we propose that the same rank 
should be given to the point ((m—a), (n —b)) as to the point (a, b). This condition we propose 
to call the ‘symmetry condition’, or ‘condition S’. It amounts to saying, that if Table 1 
is to be considered as indicating a real difference between $p_1$ and $p_2$, then so is Table 3, in
which the labels ‘I’ and ‘II’ have been interchanged. If, when we are testing whether 

* Not the sample space of Neyman and Pearson. In the sample space, different results equivalent 
to Table 2 are represented by different points. 




$p_1 = p_2$, we can say we are also testing whether $1-p_1 = 1-p_2$, from the same point of view,
then this symmetry condition is clearly justified.*

Next, we propose that in our ordering, the two points which, respectively, have the same 
abscissa or the same ordinate as (a,b), and which lie further from the diagonal PR, shall be 
considered as indicating wider differences than (a,b) itself. Thus, referring to Fig. 1, the 
points immediately above and immediately to the left of the point ‘x’ are reckoned to 
indicate wider differences than the point ‘x’ itself. This condition implies that the set of 
points indicating differences as wide or wider than (a, b) will have a shape property vaguely 
related to convexity, and we call it the ‘C condition’. It means that if we consider the 
table corresponding to Table 2, with cell frequencies 


    2  6
    5  1

as significant evidence of difference, then we must also consider the tables

    1  7        2  6
    5  1        6  0


as significant evidence of difference. It is difficult to imagine circumstances where this 
would not be so. 

Geometrically, condition S implies that we can in future restrict our considerations to 
points in the lattice diagram lying on or above the diagonal PR, i.e. in the triangle PRS. 
And condition C implies that, in this triangle, our ‘width of difference’ must increase as 
we go upwards or to the left. If horizontal and vertical axes are taken at any point X in this 
triangle, points in the second quadrant are associated with a wider difference than X is, 
points in the fourth quadrant are associated with narrower differences than X is. The relative
width of differences associated with points in the first and third quadrants (excluding the
axes) are not determined by the conditions C and S. The ordering generated by these
conditions is thus a partial, not a total, ordering; it is, in fact, a kind of conical order, in the 
sense of A. A. Robb. We must introduce some further condition to make the ordering total. 
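
The partial ordering can be made concrete with a small comparison routine. The sketch below is one reading of conditions S and C, restricted to the triangle PRS as the text describes; the function names and the returned labels are hypothetical, and relations that would follow only by chaining S and C through points on or across the diagonal are not searched for.

```python
def to_upper_triangle(a, b, m, n):
    """Condition S: a point below the diagonal PR is replaced by its
    image (m - a, n - b), which has the same rank and lies above PR."""
    if b * m < a * n:  # below the diagonal b/n = a/m
        return (m - a, n - b)
    return (a, b)


def compare_width(x, y, m, n):
    """Compare the width of difference of x with that of y.

    Returns 'equal', 'wider', 'narrower', or None when conditions S and C,
    applied inside the triangle PRS, do not decide the matter."""
    a1, b1 = to_upper_triangle(*x, m, n)
    a2, b2 = to_upper_triangle(*y, m, n)
    if (a1, b1) == (a2, b2) or (a1, b1) == (m - a2, n - b2):
        return "equal"
    if a1 <= a2 and b1 >= b2:  # x is up and/or to the left of y
        return "wider"
    if a1 >= a2 and b1 <= b2:  # x is down and/or to the right of y
        return "narrower"
    return None  # first or third quadrant: undetermined


if __name__ == "__main__":
    m, n = 8, 6
    for point in [(1, 5), (2, 6), (6, 1), (1, 4), (3, 6)]:
        print(point, compare_width(point, (2, 5), m, n))
```

In this sketch the points (1, 4) and (3, 6) are left undecided relative to (2, 5), which is the gap the further condition is meant to close.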


Probability considerations 


In many simpler cases, it is possible to distinguish those events which are considered 
incompatible with a given probability hypothesis by their relatively low probability, 
compared with other possible events. Such a simple comparison of probabilities is not open 
to us in this case, because to each point (a,b) we have, on the hypothesis tested, associated
a function

$$W(a,b;\,p) = \frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s},$$

which contains the ‘nuisance parameter’ p. If we consider the relative position, in our
ordering, of another point, (a’,b’), we have to consider the inequality

$$W(a,b;\,p) < W(a',b';\,p), \qquad (6)$$


the truth or falsehood of which depends, in general, on the unknown p; and there is nothing 


in the statement of the problem, nor in the experimental method, to justify any particular 
choice for the value of p. 
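
That the inequality (6) can change direction with p is easily exhibited. The sketch below takes m = 8, n = 6 and compares the points (2, 5) and (1, 5), a pair chosen purely for illustration. Here the ratio of the two values of W is 3.5 p/(1 - p), so the inequality reverses at p = 2/9.

```python
from math import comb


def W(a, b, p, m, n):
    """W(a, b; p) = m! n!/(a! b! c! d!) p^r (1-p)^s on the hypothesis tested."""
    r = a + b
    s = m + n - r
    return comb(m, a) * comb(n, b) * p ** r * (1 - p) ** s


if __name__ == "__main__":
    m, n = 8, 6
    for p in (0.1, 0.5):
        # True at p = 0.1 and False at p = 0.5 for this pair of points.
        print(p, W(2, 5, p, m, n) < W(1, 5, p, m, n))
```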


* Cases where $p_1 > p_2$ is impossible are hereby neglected, strictly.















If $(a+b) = (a'+b')$, the validity or otherwise of the inequality (6) is independent of p.
Thus, using this inequality as a criterion for ordering our points, we can say that in the
triangle PRS, the ‘width of difference’ must increase as we move north-west. But this is
all that can be derived from this criterion, and it is clearly even less helpful in ordering the
points than the conditions C and S are. Moreover, if we recall that each point (a, b) in the
lattice diagram really represents a set of m! n!/a! b! c! d! distinct results, each with probability
$p^{r}(1-p)^{s}$, the criterion (6) loses its plausibility.

We might try to improve the situation by associating the function W(a,b; p) with a
number, depending on a and b only. For fixed a and b, this number would be a functional
of W(a,b; p). We should clearly require that, if the inequality (6) is true for all p, then the
corresponding inequality should be true of the numbers associated with W(a,b; p) and
W(a’,b’; p). The simplest functionals which satisfy this condition will be the mean value,

$$w(a,b) = \int_0^1 W(a,b;\,p)\,dp,$$

the maximum value

$$w'(a,b) = \max_{0<p<1} W(a,b;\,p),$$

and one single value

$$w''(a,b) = W(a,b;\,p_0).$$
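
The three functionals can be evaluated in closed form, as the following sketch shows. It rests on two standard facts that are not stated in the text: the integral of p^r (1-p)^s over (0, 1) equals r! s!/(r+s+1)!, and p^r (1-p)^s attains its greatest value at p = r/N. The function names are hypothetical.

```python
from math import comb, factorial


def W(a, b, p, m, n):
    """W(a, b; p) on the hypothesis tested."""
    r = a + b
    return comb(m, a) * comb(n, b) * p ** r * (1 - p) ** (m + n - r)


def w_mean(a, b, m, n):
    """Mean value of W over 0 <= p <= 1, using the Beta integral."""
    r = a + b
    s = m + n - r
    return (comb(m, a) * comb(n, b)
            * factorial(r) * factorial(s) / factorial(r + s + 1))


def w_max(a, b, m, n):
    """Greatest value of W, attained at p = r/N (an end-point when r = 0 or r = N)."""
    return W(a, b, (a + b) / (m + n), m, n)


def w_single(a, b, m, n, p0):
    """Value of W at one chosen p0."""
    return W(a, b, p0, m, n)


if __name__ == "__main__":
    print(w_mean(2, 0, 2, 2), w_max(2, 0, 2, 2), w_single(2, 0, 2, 2, 1 / 3))
```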


Circumstances could be imagined in which any of these three criteria might produce 
reasonable tests of significance. For example, in certain genetical experiments we may have
reason to suppose that the value p = 1/3 would occur more often than any other value. In
such a case we might use $w''$, with $p_0 = 1/3$. But for general purposes taking $p_0 = 1/3$ could
not be justified.

We might again argue that taking w as our criterion would correspond with the assumption 
that all values of p were a priori equally likely. But some would say that such an assumption 
was never justified; while those who would admit the assumption would in strictness do so 
only if we really did know nothing about the value of p. And in the general circumstances we 
are trying to cater for, we may sometimes know something vague about the value of p— 
such as, for example, that p will be less than ½.

Neyman and Pearson have shown that the likelihood ratio, which in our case comes to be

$$\frac{m^{m}\,n^{n}\,r^{r}\,s^{s}}{a^{a}\,b^{b}\,c^{c}\,d^{d}\,N^{N}},$$

very often gives a good basis for ordering experimental results. We feel, however, that the
criterion we shall describe in the next section has a slightly more direct justification than
the likelihood ratio, though the choice is, admittedly, largely a matter of taste.
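
For reference, the likelihood ratio quoted above is simple to compute. The sketch below uses a hypothetical function name and adopts the usual convention that 0^0 = 1, which Python's integer arithmetic follows.

```python
def likelihood_ratio(a, b, m, n):
    """m^m n^n r^r s^s / (a^a b^b c^c d^d N^N), with 0**0 taken as 1."""
    c, d = m - a, n - b
    r, s, big_n = a + b, c + d, m + n
    numerator = m ** m * n ** n * r ** r * s ** s
    denominator = a ** a * b ** b * c ** c * d ** d * big_n ** big_n
    return numerator / denominator


if __name__ == "__main__":
    # Equal observed frequencies give a ratio of 1; the most extreme
    # table for m = n = 2 gives a much smaller value.
    print(likelihood_ratio(1, 1, 2, 2), likelihood_ratio(2, 0, 2, 2))
```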


The maximum condition 


Before setting out the final condition which, with conditions S and C, will be used even- 
tually to arrange the points of the lattice diagram in order of ‘relative width of difference 
indicated’, we need to consider the assignment of significance levels to various results. 

When we say that a given result is not significant on, say, the 5 % level, we mean that 
such a result, or one indicating a wider difference, could occur, with probability at least 0.05,
even when $p_1 = p_2$. We could believe in a theory that $p_1 = p_2$, without having to suppose
that an event belonging to a class whose joint probability was less than 0.05 had occurred.
Conversely, if a result is judged significant on the 5 % level, it means that no theory which 




assumed that $p_1 = p_2$ could account for the result obtained without supposing that an event
of a type whose probability was less than 0.05 had occurred on the occasion in question.

Let us now consider a specific case, in which we choose numbers which in practice would
be ridiculously small in order to save arithmetic. Suppose, in fact, m = n = 2, while a = 2
and b = 0. It follows from conditions S and C alone that in judging the significance of such
a result we need consider only the probability of this result, together with its converse, in
which a = 0, b = 2. If $p_1 = p_2 = p$, the probability of results of this type is

$$P = 2p^{2}(1-p)^{2}.$$

Now suppose that we are prepared to discard as untenable theories which require us to sup-
pose that events of probability less than 0.05 had occurred. In such a case, we should discard
a theory which supposed $p_1 = p_2 = 0.1$, since in this case $P = 2(0.1)^2(0.9)^2 = 0.0162$, less
than 0.05. But we could not discard a theory which supposed $p_1 = p_2 = 0.5$, since in this
case P = 0.125. In fact, our result would enable us to discard all theories involving $p_1 = p_2 = p$,
except those for which p lay in the interval 0.197 < p < 0.803. In particular practical cases
we might be prepared, on grounds external to the experiment in question, to dismiss the
possibility that p should lie in this interval; and in such cases we should be entitled to say
that the result excludes the possibility that $p_1 = p_2$.
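
The figures in this example can be recovered directly. Since P = 2p²(1-p)² is at least α exactly when p(1-p) ≥ √(α/2), the end-points of the interval are the roots of p² - p + √(α/2) = 0. The sketch below, with hypothetical function names, carries this out.

```python
import math


def P(p):
    """Probability of the result a = 2, b = 0 or its converse when m = n = 2."""
    return 2 * p ** 2 * (1 - p) ** 2


def tenable_interval(alpha):
    """Interval of p for which P(p) >= alpha, or None if there is none."""
    k = math.sqrt(alpha / 2)  # required lower bound for p(1 - p)
    if k > 0.25:  # p(1 - p) never exceeds 1/4
        return None
    half_width = math.sqrt(1 - 4 * k) / 2
    return (0.5 - half_width, 0.5 + half_width)


if __name__ == "__main__":
    print(P(0.1), P(0.5), tenable_interval(0.05))
```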

It is easy to see that the above specific case is typical. Any set of points in the lattice 
diagram, considered by some criterion agreeing with conditions S and C to indicate differ- 
ences as wide or wider than those of a given result, will be associated with a probability P, 
on the assumption $p_1 = p_2 = p$; and this P will be a function of p, rising from zero when
p = 0 to a maximum in the neighbourhood of p = ½, and then falling again symmetrically
(by the S condition) to zero again at p = 1, somewhat as in Fig. 2. The given result by itself











0 "ep 2 
Fig. 2 


will exclude the possibility p, = p, altogether, only if the significance level adopted is greater 
than P,,, the maximum value of P. If our significance level corresponds to a probability 
less than P,,, then all we can say is, that our result is incompatible with p, = p, unless their 
common value lies in a certain subset of the range (0,1). We may or may not exclude these 
latter possibilities on other grounds. 

In trying to construct our test, however, we have set ourselves the task of evaluating the
evidence provided by our experiment alone in relation to the hypothesis $p_1 = p_2$. It now appears
that this is impossible so long as we restrict ourselves to the form, usual in such cases, of a
simple statement that a given result is, or is not, significant on a given level. We have two
alternatives. Either we can find an entirely new form of statement to convey what we wish
to express; or we can adhere to the form of statement, and try to make the situation fit the
form as nearly as possible. Perhaps the day will come when experimenters do not require
answers in the form of numbers, when they are sufficiently versed in generalized mathematical
analysis to be content with a function (such as the function P(p)), instead of a single
number. But we have not yet reached this stage; and so we propose to take up the latter
alternative, and try to make the situation fit the standard form of statement of significance
tests as nearly as possible.*

Our difficulty arises from the dependence of P on p. If the graph of P against p were a
horizontal straight line, our difficulty would be overcome. What we propose, therefore, is
to try to make the graph of P against p as near to a horizontal line as possible, by suitably
adapting our idea of what is meant by ‘width of difference’. In making this adaptation, we
shall secure that we do not violate the common-sense requirements as to the meaning of the
term ‘width of difference’, by requiring that conditions C and S should always be satisfied.


The maximum condition 


The condition C requires that, of all points in the triangle PRS, that indicating the
‘widest difference’ must be the point S at the corner (Fig. 1). The function P associated with
this point and its converse, Q, which we may denote as P(0, 6; p), is

$$P(0, 6; p) = p^6(1-p)^8 + p^8(1-p)^6$$

and the maximum $P_m$ occurs here when p = 1/2, where we have

$$P_m(0, 6) = 1/2^{13} = 1.22 \times 10^{-4}.$$

The condition C requires that the only points which might be considered as coming next
after S, in order of decreasing ‘width of difference’, are (1,6) and (0,5). We have to adopt
some principle to choose between these two.

If (1,6) were taken next after (0,6), the function P associated with it would be

$$P'(1, 6; p) = P(0, 6; p) + 16p^7(1-p)^7$$

and $P'_m(1,6)$ would come to $9/2^{13} = 10.97 \times 10^{-4}$. On the other hand, if (0,5) were chosen
next, instead of (1,6), we should have

$$P(0, 5; p) = P(0, 6; p) + 6[p^5(1-p)^9 + p^9(1-p)^5]$$

and $P_m(0,5)$ would come to $8.58 \times 10^{-4}$, the maximum occurring when $p = \frac{1}{2} \pm \frac{1}{2}\sqrt{6/70}$.
Thus $P_m(0, 5)$ is smaller than $P'_m(1, 6)$, and this lower maximum is associated with a flatter
curve of P(0,5; p). Since a flat curve is our aim (the horizontal line being the ideal), we
choose (0, 5) as the point to come next after (0, 6), rather than (1, 6).

Having chosen (0, 5) as the next ‘widest difference’ point, the C condition restricts us to
the points (1, 6) and (0, 4) as candidates for the next position. We consequently compare

$$P(1, 6; p) = P(0, 5; p) + 16p^7(1-p)^7$$

with

$$P''(0, 4; p) = P(0, 5; p) + 15[p^4(1-p)^{10} + p^{10}(1-p)^4]$$

and the lower value of $P_m$ as criterion shows that (1, 6) is now to be taken. At the next stage,
we shall have to compare the functions associated with (0, 4), (1,5) and (2,6). In this way
we can arrange the points of the lattice diagram in order, step by step.

The principle involved, which we call the ‘maximum condition’, may be formally stated 
as follows: 

Considering only points for which a/m is less than b/n, if the first (n−1) points $(a_1, b_1)$,
$(a_2, b_2)$, ..., $(a_{n-1}, b_{n-1})$, in order of decreasing ‘width of difference’ have been chosen, and
$(a_{n-1}, b_{n-1})$ is associated with the function $P(a_{n-1}, b_{n-1}; p)$, then the nth point, $(a_n, b_n)$, is
that point, of all points (a,b) permitted by the C condition, for which

$$P_m(a,b) = \max_{0<p<1}\left[P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a!\,b!\,c!\,d!}\left(p^{a+b}(1-p)^{c+d} + p^{c+d}(1-p)^{a+b}\right)\right]$$

is least, where c = m − a and d = n − b. The point $(a_n, b_n)$ is then associated with the function

$$P(a_n, b_n; p) = P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a_n!\,b_n!\,c_n!\,d_n!}\left[p^{a_n+b_n}(1-p)^{c_n+d_n} + p^{c_n+d_n}(1-p)^{a_n+b_n}\right].$$

* In the example just taken we might make a kind of ‘conditional confidence interval statement’,
that, if p existed, we should have 0.197 < p < 0.803 with confidence coefficient 0.95.


To complete the specification of the ordering, we have to legislate for the case where there
are several points giving the same value of $P_m(a, b)$, this value being less than that associated
with any other permissible point. In this case we lay down that all such points are to be
given the same rank, and the second term in the expression for $P(a_n, b_n; p)$ is to be replaced
by the corresponding sum over all these points. If there are k such points at any stage, then
the next point after them will be denoted as the (n + k)th point in the ordering. This requires,
for example, when m = n, that the points (a,b) and (b, a) are always to be taken together.

Finally, the significance level to be attached to the point $(a_n, b_n)$ will be

$$P_m(a_n, b_n) = \max_{0<p<1} P(a_n, b_n; p).$$


This guarantees that our test will be a ‘valid’ one, in the sense that, if we judge a result
incompatible with the hypothesis $p_1 = p_2$, on a given level of significance, then all the
possibilities of the form $p_1 = p_2$ are excluded, to the given level. Thus no further information,
external to the experiment in question, could make us decide that a result judged significant
by our test was not in fact so (holding, of course, to a fixed significance level); on the other
hand, we still have the possibility that other information may lead us to consider as significant
results which appear in themselves not to be so. The formulation of our maximum
condition is made so as to minimize this latter possibility. Our test is thus conservative, in
the sense that we do not draw the conclusion $p_1 \neq p_2$ unless this is certainly warranted by
the data; but it might be called ‘progressive conservative’, because, of all such conservative
tests, it will be the least conservative.
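
The step-by-step ordering described above lends itself to a short numerical illustration. The sketch below is in Python and is not taken from the paper: the names `pair_term`, `csm_order` and `GRID` are hypothetical, and the maximum over p is approximated on an evenly spaced grid rather than found exactly. It assumes that c = m − a and d = n − b, and the usage lines take m = 8, n = 6, the sample sizes implied by the exponents in the worked example.

```python
from math import comb

GRID = [i / 2000 for i in range(2001)]  # trial values of p, including p = 1/2


def pair_term(a, b, m, n, p):
    """Probability of (a, b) plus that of its converse (m - a, n - b) when p1 = p2 = p."""
    r, s = a + b, (m - a) + (n - b)
    return comb(m, a) * comb(n, b) * (p**r * (1 - p)**s + p**s * (1 - p)**r)


def csm_order(m, n, steps=None, tol=1e-12):
    """Rank the points with a/m < b/n by the maximum condition (grid approximation).

    Returns a list of (points of equal rank, significance level P_m) in rank order.
    """
    upper = {(a, b) for a in range(m + 1) for b in range(n + 1) if a * n < b * m}
    chosen, ranking = set(), []
    curve = [0.0] * len(GRID)  # P(p) accumulated over the points ranked so far

    def admissible(point):
        # Condition C: the neighbours to the west and to the north must already be ranked.
        a, b = point
        return (a == 0 or (a - 1, b) in chosen) and (b == n or (a, b + 1) in chosen)

    while chosen != upper and (steps is None or len(ranking) < steps):
        # Peak of the trial curve for every candidate permitted by condition C.
        peaks = {
            (a, b): max(c + pair_term(a, b, m, n, p) for c, p in zip(curve, GRID))
            for a, b in sorted(upper - chosen)
            if admissible((a, b))
        }
        best = min(peaks.values())
        tied = [point for point, peak in peaks.items() if peak - best <= tol]
        for a, b in tied:  # points of equal rank are added together
            curve = [c + pair_term(a, b, m, n, p) for c, p in zip(curve, GRID)]
        chosen.update(tied)
        ranking.append((tied, max(curve)))
    return ranking


if __name__ == "__main__":
    for points, level in csm_order(8, 6, steps=3):
        print(points, f"{level:.3e}")
```

For m = 8, n = 6 the first three ranks come out as (0, 6), (0, 5) and (1, 6), in agreement with the choices made in the text. A finer grid, or an exact maximization, would be needed where two candidates give nearly equal maxima.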


Another aspect of the maximum condition 


When the author first approached the problem of analysis of experimental results of the 
type now considered, he did so from the point of view of regarding the significance level to 
be used as being fixed in advance, say at the 5 % level. From this point of view, the problem 
of constructing a test resolved itself, not into one of ordering the points in the lattice diagram, 
but into one of choosing a region, or set of points in the lattice diagram, such that any point 
belonging to this region could be regarded as evidence of inequality of $p_1$ and $p_2$, on the given
level of significance. The condition of symmetry required that such a region should consist 
of two similar parts, one above the diagonal PR, and one below it. The condition C required 
that the part of the region lying above the diagonal PR should be so shaped that if a point 
X belonged to the region, then so would all points lying north or west of X. There remained 
the problem, to decide which of the many regions satisfying these two conditions should be 
the one adopted. 

To settle this, to any such region R we can associate a function

$$P(R; p) = \sum_{R} \frac{m!\,n!}{a!\,b!\,c!\,d!}\, p^{a+b}(1-p)^{c+d}$$

and such a region will give a ‘valid’ test of significance provided that

$$\max_{0<p<1} P(R; p) < 0.05.$$

There will not be so many regions satisfying this validity condition as well as the conditions 
S and C. We proposed, therefore, to select that region from among these, which had the 
greatest number of points in it. This last condition was what we then called the ‘maximum 
condition’. The fact that this region would not be unique in cases where m = n was taken 
care of by requiring a subsidiary symmetry condition that in such cases (a,b) and (b,a) 
should always be taken together. 

What we have now adopted as the ‘maximum condition’ can be seen to be related to this
earlier version, by the consideration that, roughly speaking, apart from effects due to the
discreteness of the lattice diagram, holding the number of points in the region constant, and
then choosing the region which gives the lowest value for $P_m$, as we do now, comes to the same
thing as holding $P_m$ constant, and then choosing the region to have the maximum number
of points.

Other things being equal, the ‘power’ of a test, in the sense of Neyman and Pearson, will
increase with the ‘volume’ of the rejection region chosen. In this sense we can say, roughly,
that the maximum condition secures that our test should be as powerful as possible, consistent
with validity.


Practical formulation of the test 


Some statistical tests (such as that due to Fisher, already mentioned), can be carried out 
in the form of a direct calculation from the data, without reference to any special tables. 
Most other tests require the use of special tables which, however, are for the most part tables 
of single or double entry, perhaps triple entry, if the level of significance is regarded as a 
variable. In our case, regarding the level of significance as a variable, a table of quadruple 
entry would be required. 

Ideally, a set of tables, one for each pair of values of m and n (m ≥ n) would be required.
The table would be in the shape of a right-angled triangle, corresponding to the triangle
PRS of Fig. 1, and divided into squares, each square corresponding to given values of a and b.
Within each square (a,b) would then appear a number, the value of $P_m(a,b)$. This value of
$P_m(a, b)$ then is the maximum probability of obtaining the result (a,b), or one indicating a
wider difference, if $p_1 = p_2$. A comparison of $P_m(a, b)$ with the significance level adopted will
then decide the significance or otherwise of our result. In any particular case we shall be able
to see which tables, in the sense of our test, are regarded as indicating a wider difference, by
noting which points are associated with lower values of $P_m(a, b)$.

In practice, it will be impossible to construct such tables for a large range of values of m
and n. But for larger values of m and n, a test based on a normal approximation to the distributions
involved will be quite adequate for practical purposes. In fact, the test we have
proposed will itself approximate, in some sense, to a test based on the normal distribution,
though we do not enter into a detailed discussion of the relationship between the two tests
here.* Tables are thus required for our test only for small values of m and n. In spite of
advice by statisticians to the contrary, such small values of m and n continue to occur
frequently in practice.


* The general question of the sense in which tests are regarded as ‘asymptotically approaching’ 
normal tests is a subject for another paper. Professor Pearson’s paper which follows, bears on this point. 










In the Appendix we give specimen tables for the cases where N = 14. The comparative 
figures for the Fisher test, also given in the Appendix, indicate that the differences between 
the two tests are appreciable. An exploration is now under way into larger values of m and n, 
and it is hoped to report on this in due course. 


Other applications of the C.S.M. procedure 


We have spoken of our test as the C.S.M. test, as if the case dealt with above were the only
case to which the procedure adopted was applicable. But similar methods could be used
in many other cases. In particular, a method closely following the one we have used might
be applied to the case we have called the double dichotomy, which differs from the 2 x 2
comparative trial in that two ‘nuisance parameters’, p and p′, are present, instead of only
one. The 2-dimensional lattice diagram of the 2 x 2 trial is replaced by a 3-dimensional
regular tetrahedron of points with homogeneous coordinates (a,b,c,d), connected by the
relation
a + b + c + d = N.
Two opposite edges of this tetrahedron correspond to m = 0 and n = 0, and sections of the
tetrahedron by planes parallel to these edges will look exactly like lattice diagrams for the
2 x 2 case and within these sections, relative probabilities will behave just as in the 2 x 2 case.
An examination of the possibilities, however, indicates that not much is to be gained by a
detailed treatment. The C.S.M. test for 2 x 2 comparative trials will be a valid test if applied
to double dichotomies. It will err somewhat on the side of ‘conservatism’, but the error
does not appear to be large, except when the numbers involved are exceedingly small.

It is with a view to further applications of the approach used in this paper that we have 
retained the C condition as a separate requirement, although it is easy to see that it could 
be absorbed into the M condition as we have given it. 


In writing this paper the author has had great personal help and encouragement from 
Prof. E. S. Pearson, to whom he wishes to express his very deep thanks.


SUMMARY 


In Part I we discuss various types of experiment, each of which may give rise to results 
in the form of a 2x2 table. It appears that significance tests which may be appropriate 
for one type of experiment will not necessarily be appropriate for another.


In Part II a test is developed for experiments of the type called ‘2 x 2 comparative 
trials’. 


APPENDIX 


Tables for the CSM test 
Three tables are given below to illustrate the application of the ideas given in the main paper 
to the construction of a test for 2 x 2 comparative trials. The cases covered are pairs of 
samples, sizes (7, 7), (8, 6), and (9, 5). The small figures in brackets in the (7, 7) table give
significance levels on Fisher’s ‘exact’ test for 2 x 2 independence trials, for comparison. 
Only half of the (8, 6) and (9, 5) tables are given; the missing parts can be filled in by 
symmetry. The following examples show the meaning and use of the tables: 













Example 1. Two boxes, each containing a large number of components, are to be tested
for comparative quality measured by the respective proportions of defective components 
they contain. Two samples, each of seven components, are taken, at random, one from each 
box. One sample gives four defectives, the other, none. What is the significance of this result, 
in relation to the hypothesis that the boxes have the same quality ? 

Answer. Entering the (7, 7) table at the point (0, 4), we find the number 2.4. This means
that the result is evidence against the hypothesis, on the 2.4 % level of significance.


Table for m = n = 7
(significance levels in per cent; Fisher's 'exact' test levels in brackets)

b \ a  0              1              2              3              4              5              6              7
  7    0.012 (0.058)  0.18 (0.23)    0.70 (2.1)     2.4 (7.0)      7.5 (19)       20 (46)        -              -
  6    0.18 (0.23)    1.3 (2.9)      5.7 (10)       13 (27)        -              -              -              -
  5    0.70 (2.1)     5.7 (10)       21 (29)        -              -              -              -              20 (46)
  4    2.4 (7.0)      13 (27)        -              -              -              -              -              7.5 (19)
  3    7.5 (19)       -              -              -              -              -              13 (27)        2.4 (7.0)
  2    20 (46)        -              -              -              -              21 (29)        5.7 (10)       0.70 (2.1)
  1    -              -              -              -              13 (27)        5.7 (10)       1.3 (2.9)      0.18 (0.23)
  0    -              -              20 (46)        7.5 (19)       2.4 (7.0)      0.70 (2.1)     0.18 (0.23)    0.012 (0.058)


More precisely, what is asserted is, that the maximum probability of getting a result not
less significant than that obtained, is 0.024. And the results which are not less significant
are those which correspond to points in the table with numbers not greater than 2.4, viz.
(0, 4), (7, 3), (0, 5), (7, 2), (0, 6), (7, 1), (0, 7), (7, 0), (1, 6), (6, 1), (1, 7), (6, 0), (2, 7), (5, 0),
(3, 7), (4, 0). By suitable choice of the proportion defective, we could construct a pair of
boxes, of equal quality, which would give samples falling in this group 24 times out of 1000,
on the average; but we could not, by any choice of proportion defective, retain equal quality
and yet have results in this group more often than 24 times in 1000.
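
The assertion of Example 1 can be checked by direct summation over the sixteen points listed. The following Python sketch is an illustration added here and not part of the original Appendix; the names `region_probability`, `max_region_probability` and `REGION_EXAMPLE_1` are hypothetical, and the maximum over the common proportion defective p is approximated on an evenly spaced grid.

```python
from math import comb


def region_probability(region, m, n, p):
    """P(R; p): chance of a result falling in R when both boxes have proportion defective p."""
    q = 1.0 - p
    return sum(
        comb(m, a) * comb(n, b) * p**(a + b) * q**((m - a) + (n - b))
        for a, b in region
    )


def max_region_probability(region, m, n, grid=2000):
    """Largest value of P(R; p) over an evenly spaced grid of p from 0 to 1."""
    return max(region_probability(region, m, n, i / grid) for i in range(grid + 1))


# The points of the (7, 7) table whose entries are not greater than 2.4.
REGION_EXAMPLE_1 = [
    (0, 4), (7, 3), (0, 5), (7, 2), (0, 6), (7, 1), (0, 7), (7, 0),
    (1, 6), (6, 1), (1, 7), (6, 0), (2, 7), (5, 0), (3, 7), (4, 0),
]

if __name__ == "__main__":
    peak = max_region_probability(REGION_EXAMPLE_1, 7, 7)
    print(f"{100 * peak:.1f} %")  # to be compared with the tabulated 2.4
```

The same two functions serve to test whether any proposed region R gives a ‘valid’ test on a chosen level: it does so when the value returned by `max_region_probability` is below that level.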


Table for m = 8, n = 6
(upper left-hand entries, read with the top and left-hand scales)

     0      1      2      3      4      5      6      7      8
6    0.012  0.18   0.71   2.5    5.3    13     -      -      -      -
5    0.085  1.3    6.6    11     -      -      -      -      -      -      5
4    0.44   3.9    19     -      -      -      -      -      -      -      4
3    1.9    16     -      -      -      -      -      -      -      6.3    3
2    8.0    -      -      -      -      -      -      20     3.8    0.86   2
1    23     -      -      -      -      -      14     7.4    1.3    0.13   1
0    -      -      -      16     10     5.3    2.3    0.62   0.19   0.012  0
     0      1      2      3      4      5      6      7      8      9

Table for m = 9, n = 5
(lower right-hand entries, read with the bottom and right-hand scales)


Example 2. The situation is as before, except that the first sample has nine components, 
none of them defective, while the second sample has five components, four of them defective. 







Answer. Here, to use the table as given, we have to compare numbers effective, rather
than numbers defective, viz. we consider the pair (9, 1) rather than (0, 4). Entering the
(9, 5) table at (9, 1) we find 0.13. The result is evidence against the hypothesis of equal
quality, on the 0.13 % level of significance.


Thanks are due to Miss Lang, who has checked the computations. 


REFERENCES 


FISHER, R. A. (1941). Statistical Methods for Research Workers, 8th ed. Edinburgh: Oliver and Boyd.
FISHER, R. A. (1942). The Design of Experiments, ch. II. Edinburgh: Oliver and Boyd.










THE CHOICE OF STATISTICAL TESTS ILLUSTRATED ON THE 
INTERPRETATION OF DATA CLASSED IN A 2x2 TABLE 


By E. S. PEARSON 


CONTENTS
(i) Introductory
(ii) The choice of statistical tests
(iii) Application of this approach to the analysis of data classed in a 2 x 2 table
(iv) Problem I
(v) Problem II
(vi) Solution of Problem II, using the normal approximation
(vii) The classical approach to Problem II
(viii) Problem III
(ix) General comment
References
Appendix


(i) INTRODUCTORY 


1. The problem of testing the significance of a difference between two proportions is one
which receives early attention in text-books on mathematical statistics, and it might be
thought to be one of the questions whose final solution lies behind us. It is a problem whose
simplicity makes it easy to examine the logical cogency of the methods put forward for its
solution, but, on examination, it is evident that they have not yet been rounded off satisfactorily.
The origin of the present paper lies partly in an investigation commenced in 1938
and discussed at the time in College lectures, and partly in recent correspondence in Nature
in which G. A. Barnard (1945a, b) and R. A. Fisher (1945a) have taken part.* This
correspondence has suggested that in a problem of such apparent simplicity, starting from
different premises, it is possible to reach what may sometimes be very different numerical
probability figures by which to judge significance.

2. Such a difference in levels of significance in the solution of an everyday problem is
obviously puzzling to the users of statistical methods who are accustomed to accept the
technique as an established procedure and have not the opportunity for a critical examination
of the conditions under which probability theory is brought to bear as a guide to action.
For the question here at issue is a fundamental one of why and how our judgement is influenced
by the calculation of a probability, and the dilemma raised by the Barnard-Fisher
correspondence can only be answered in terms of our views on the practical function of the
theory. We may all agree that in practice we use probability figures derived from an analysis
of numerical data to help us to make up our minds on the next step, whether in experimental
research or executive action. But what form of presentation of the probability set-up
is likely to result in the greater number of sound decisions is likely to be always a matter for
differences of opinion.


* There was also an earlier discussion on the same subject between E. B. Wilson (1941, 1942) and
R. A. Fisher (1941).

3. All that I can do is to approach the problem of the 2 x 2 table from the viewpoint
which appears most helpful to me. In the preceding paper Mr Barnard has elaborated the
views expressed in his letters to Nature. Such discussion is, I believe, desirable, even though
controversial issues are raised. For the value of the whole elaborate structure of the 
modern theory of mathematical statistics depends at least in part on the sense in which the 
individual statistician appreciates the meaning of the probability model he is using when 
drawing the practical conclusions from his analysis of data. I have used the words ‘in part’, 
for it is true that the analytical process of applying the statistical technique to experi- 
mental data may in itself be enormously illuminating even without paying any close regard 
to a final probability figure. Such is the case, for example, with the technique of analysis of 
variance, where the mere process of breaking up a total sum of squares into parts with which 
different sources of variability can be associated, brings with it a reward in clear thinking 
even without the application of a probability test. 


4. There is a very wide variety in the types of situation in which probability theory is
introduced to help in reaching a decision as to further action. 

(A) At one extreme we have the case where repeated decisions must be made on results 
obtained from some routine procedure carried out under controlled conditions. 

(B) At the other is the situation where statistical tools are applied to an isolated investiga- 
tion of considerable importance in which many of the issues involved in the conclusion can 
hardly be assessed in numerical terms. 


5. Two situations of this kind, in which the statistical technique involved is that of testing 
the significance of a difference between two proportions, may be illustrated from problems 
arising in the ‘proof’ of armour-piercing shot or shell. 


6. Example of type A. In the proof of small anti-tank, armour-piercing shot it might be 
decided to set aside, as a standard, a batch of shot whose quality has been established by 
special trials; against this standard, later batches can be compared. The variable measured 
is the proportion of shot which fail to perforate a plate of specified thickness when fired with 
a given striking velocity. The use of standard shot is necessary for calibration purposes, 
because there are inevitable changes in toughness from one proof plate to another and only 
a limited number of shot can be fired at a single plate. Then the situation might be summed 
up as follows:* 

Aim of proof. To ensure that as few batches as possible are passed into service which 
are less effective than the standard. 

Method of proof. Twelve rounds of the standard and twelve of the batch under test to be
fired, round for round, against a single test plate and a record kept of the number of failures
in each group, say a and b.

Routine sentencing rule. This should lay down a ready means of determining, from a 
knowledge of a and b, whether to class the new batch as inferior to the standard or not. 

Assumptions accepted in using rule. That the two samples of twelve shot have each been
randomly selected from the much larger batches. That against the particular plate used, a
proportion $p_1$ of the standard and $p_2$ of the new batch would fail to give satisfactory perforation
at the specified striking velocity. That while $p_1$ and $p_2$ would be different for other
plates, if $p_2 > p_1$ for one plate, it will be so for all other plates. The objective is to segregate
batches of shot for which $p_2 > p_1$.


* It has been somewhat simplified for illustrative purposes, e.g. complete control of the striking 
velocity is not in practice possible. 
















7. Example of type B. Two types of heavy armour-piercing naval shell of the same 
calibre are under consideration; they may be of different design or made by different firms. 
Since the cost of producing and testing a single round of this kind runs into many hundreds 
of pounds, the investigation is a costly one, yet the issues involved are far reaching. Twelve 
shells of one kind and eight of the other have been fired; two of the former and five of the 
latter failed to perforate the plate. In what way can a statistical test contribute to the 
decision which must be taken on further action? 


8. In dealing with Example A the guiding principle followed in seeking help from the
theory of probability can be very simple. We can set as our object a rule which:

(i) will result in an increasing chance of detecting that $p_2 > p_1$, the larger the difference;
(ii) will leave only a small chance of segregating the new batch wrongly when, in fact, $p_2 < p_1$.

Diagrammatically the rule would consist in segregating the new batch when the point (a, b)
falls within some such area as that shown shaded in Fig. 1. In this problem involving a routine
procedure, it is the long-run frequency of different consequences of the proof sentencing which
is of importance, and probability theory is introduced to provide a measure of expected
frequency. This method of introducing the theory of probability into this proof problem is not
necessarily the only one that could be adopted in fixing a routine procedure, but it is a simple
one and, since simplicity has the merit of appealing to the user’s understanding, it has
great advantages.

[Fig. 1. Diagram of the points (a, b), with the area in which the new batch is segregated shown shaded.]
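
Requirements (i) and (ii) can be examined numerically for any proposed sentencing rule. The Python sketch below is an illustration only and is not the rule of Fig. 1: it assumes a hypothetical rule which segregates the new batch when b − a is at least some fixed `margin`, with twelve rounds fired from each batch, and the names `binom_pmf`, `chance_of_segregation` and `margin` are introduced here for the purpose.

```python
from math import comb


def binom_pmf(k, n, p):
    """Chance of exactly k failures in n independent rounds, each failing with chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def chance_of_segregation(p1, p2, rounds=12, margin=4):
    """Chance that the hypothetical rule 'segregate when b - a >= margin' operates.

    a is the number of failures of the standard (failure proportion p1) and
    b that of the new batch (failure proportion p2).
    """
    return sum(
        binom_pmf(a, rounds, p1) * binom_pmf(b, rounds, p2)
        for a in range(rounds + 1)
        for b in range(rounds + 1)
        if b - a >= margin
    )


if __name__ == "__main__":
    for p2 in (0.2, 0.4, 0.6, 0.8):
        print(p2, round(chance_of_segregation(0.2, p2), 4))
```

With $p_1$ held fixed, the chance of segregation rises as $p_2$ rises, which is requirement (i); the value obtained when the two proportions are equal measures how far requirement (ii) is met. Other regions of the (a, b) diagram can be tried by changing the condition inside the sum.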














9. When dealing with Example B a very considerable
number of factors must be weighed in
the balance, and the result of a statistical test of
significance could never be the over-riding one.
There will be other information as to the effect of changes in shell design, possibly from
shell of different calibre; information as to the uniformity in quality of output of the
firm or firms concerned; questions of cost and of general policy. He would be a bold man who
would attempt to express these in numerical terms. Whereas when tackling problem A
it is easy to convince the practical man of the value of a probability construct related to
frequency of occurrence, in problem B the argument that ‘if we were to repeatedly do so
and so, such and such result would follow in the long run’ is at once met by the common-sense
answer that we never should carry out a precisely similar trial again.




10. Nevertheless, it is clear that the scientist with a knowledge of statistical method
behind him can make his contribution to a round-table discussion, provided he has
acquired a grasp of the practical issues. Starting from the basis that individual shell will
never be identical in armour-piercing qualities, however good the control of production,
he has to consider how much of the difference between (i) two failures out of twelve and
(ii) five failures out of eight is likely to be due to this inevitable variability. There may be a
number of ways of sizing up the position involving different assumptions or hypothetical
constructs; he may follow one or several of these. The value of his advice is dependent almost
entirely on the soundness of his scientific judgement, and very little on whether his back-room
calculations have been based on inverse or direct probability or on an appeal to
fiducial argument.


11. How far, then, can one go in giving precision to a philosophy of statistical inference?
It seems clear that in certain problems probability theory is of value because of its close
relation to frequency of occurrence; such seems to be the case for my Example A. Tests can
be built up to satisfy the practical requirements in this field. In other and, no doubt, more
numerous cases there is no repetition of the same type of trial or experiment, but all the
same we can and many of us do use the same test rules to guide our decision, following the
analysis of an isolated set of numerical data. Why do we do this? What are the springs of
decision? Is it because the formulation of the case in terms of hypothetical repetition helps
to that clarity of view needed for sound judgement? Or is it because we are content that the
application of a rule, now in this investigation, now in that, should result in a long-run
frequency of errors in judgement which we control at a low figure? On this I should not care
to dogmatize, realizing how difficult it is to analyse the reasons governing even one’s own
personal decisions.


12. That the frequency concept is not generally accepted in the interpretation of statistical
tests is of course well known. With his characteristic forcefulness R. A. Fisher (1945b)
has recently written: ‘In recent times one often repeated exposition of the tests of significance,
by J. Neyman, a writer not closely associated with the development of these tests,
seems liable to lead mathematical readers astray, through laying down axiomatically, what
is not agreed or generally true, that the level of significance must be equal to the frequency
with which the hypothesis is rejected in repeated sampling of any fixed population allowed
by hypothesis. This intrusive axiom, which is foreign to the reasoning on which the tests of
significance were in fact based seems to be a real bar to progress....’


13. But the subject of criticism seems to me less an intrusive mathematical axiom than 
a mathematical formulation of a practical requirement which statisticians of many schools 
of thought have deliberately advanced. Prof. Fisher’s contributions to the development of 
tests of significance have been outstanding, but such tests, if under another name, were 
discovered before his day and are being derived far and wide to meet new needs. To claim 
what seems to amount to patent rights over their interpretation can hardly be his serious 
intention. Many of us, as statisticians, fall into the all too easy habit of making authoritative 
statements as to how probability theory should be used as a guide to judgement, but 
ultimately it is likely that the method of application which finds greatest favour will be that 
which through its simplicity and directness appeals most to the common scientific user’s 
understanding. Hitherto the user has been accustomed to accept the function of probability 
theory laid down by the mathematicians; but it would be good if he could take a larger share
in formulating himself what are the practical requirements that the theory should satisfy
in application.


(ii) THE CHOICE OF STATISTICAL TESTS 


14. One approach to follow in determining tests to be applied to the 2 x 2 class of problem
follows the lines that Neyman and I have adopted since 1928 in dealing with tests of statistical
hypotheses. Let me first recapitulate in broad terms the steps in that approach when
applied to a problem where the universe of possible observations can be represented by a
finite set of discrete points. A test of significance may be described as a method of analysis
of statistical data which helps us to discriminate between alternative theories or hypotheses.
In order to make use of the theory of probability in the sense here understood, a random
process must either have been purposely introduced or be assumed to have been present in
the collection of data; then the hypothesis very often concerns the values of parameters
contained in the probability laws which, in the conceptual sphere, form the mathematical
counterpart of the sampling distributions of experience.


15. We proceed by setting up a specific hypothesis to test, H₀ in Neyman’s and my
terminology, the null hypothesis in R. A. Fisher’s. At the same time, in choosing the test, we
take into account alternatives to H₀ which we believe possible or at any rate consider it
most important to be on the look out for. Thus we wish the test to have maximum
discriminating power within a certain class of hypotheses. Three steps in constructing the test
may be defined:

Step 1. We must first specify the set of results which could follow on repeated application 
of the random process used in the collection of the data; this may be termed the experi- 
mental probability set. 

Step 2. We then divide this set by a system of ordered boundaries or contours such that 
as we pass across one boundary and proceed to the next, we come to a class of results which 
makes us more and more inclined, on the information available, to reject the hypothesis 
tested in favour of alternatives which differ from it by increasing amounts. 

Step 3. We then, if possible, associate with each contour level the chance that, if H₀ is
true, a result will occur in random sampling lying beyond that level.

This rather crude statement of procedure will be developed in more detail in discussing 
the problems that arise in connexion with the 2 x 2 table. 


16. Notes on these points. (a) Step 1. This involves the definition of what Neyman and 
I have termed the sample space, W. The application in three forms of the 2 x 2 problem is 
discussed in paragraphs 19, 27 and 46 below. 

(b) Step 2. For a given hypothesis under test there may be a number of ways of deriving
a system of contours, and only in certain cases can there be said to be complete agreement
on which is the ‘best’. Practical expediency will often carry weight in the choice. It is widely
accepted that the choice cannot be made without paying regard to the admissible hypotheses
alternative to H₀, whether this process is given formal precision or taken as a broad guide.
In our first papers (Neyman & Pearson, 1928a,b) we suggested that the likelihood ratio
criterion, λ, was a very useful one to employ in determining a family of contours which
would be ordered in relation to our confidence in the hypothesis tested when set against
the background of admissible alternatives. Thus Step 2 preceded Step 3. In later papers
(Neyman & Pearson, 1933, 1936 and 1938) we started with a fixed value for the chance, ε,
of Step 3 and determined the associated contour, taking account of what we termed the
power of a test with regard to the alternative hypotheses. The family of Step 2 followed
on giving decreasing values to ε. However, although the mathematical procedure may
put Step 3 before 2, we cannot put this into operation before we have decided, under
Step 2, on the guiding principle to be used in choosing the contour system. That is why
I have numbered the steps in this order.

(c) Step 3. If this can be accomplished, we have what Neyman and I called control of the
‘1st kind of error’. In problems where, as below, we are concerned with discrete rather than










continuous probability distributions (e.g. for the binomial, the Poisson, the multinomial 
and the hypergeometric distributions), this objective cannot always be achieved, and it 
may be necessary to be satisfied with a knowledge of an upper limit of the chance of rejecting 
the hypothesis tested when it is true. 


(iii) APPLICATION OF THIS APPROACH TO THE ANALYSIS OF DATA CLASSED IN A 2 x 2 TABLE


17. The frequencies of the data in the table may be defined in the following notation:

Table 1

            Col. 1    Col. 2    Total
Row 1         a         c         m
Row 2         b         d         n
Total         r         s         N

















If we follow in turn the steps defined above to determine the method of interpretation of
such data, the requirements of the appropriate tests are seen to follow very simply, although
mathematical or computational difficulties arise in implementing them. On taking Step 1
we can separate out at once the three types of problem which Barnard has differentiated;*
these I shall call Problems I, II and III. They are distinguished by the sample space having
1, 2 and 3 dimensions respectively. From the mathematical point of view it might seem more
logical to take them in the reverse order, adding first one and then a second restriction to
the 3-dimensioned case of Problem III. For a simple exposition, I think the reverse procedure
of building up from I to III is preferable and this has been adopted in the following sections.


(iv) PROBLEM I


18. This may be described as the test of the significance of the difference between two 
treatments after these have been randomly assigned to a group of N = m+n individuals 
(Barnard terms it the 2 x 2 independence trial). To use the terminology of a particular
application, we may say that we are observing the presence or absence of ‘reaction X’. 
The first treatment is applied to m and the second to n of the N individuals; as a result a/m 
and b/n show reaction X. 


19. In this case the random process has been applied within the group of N individuals,
and its repetition would simply involve other random reassignments of the two treatments
among the N. No assumption is made as to how the N individuals were selected from some
larger universe. The repetition may be hypothetical, in the sense that it often could not
take place, e.g. if reaction X = death. Indeed, repetition under the same essential conditions
is frequently impossible in practice. But this correspondence between the frequency of
results upon hypothetical repetition and the probability distribution of the counterpart
mathematical model forms an accepted part of the process of reasoning whereby (following

* Statisticians had, of course, all been more or less conscious of these differences, but, at any rate
in my own case, it was discussion with Mr Barnard which made it easy to see the problem in its full
clarity.










the present approach) we use probability theory as a basis for inference. The hypothesis
tested is that while some individuals show reaction X and some do not, the result would be
the same whichever treatment were applied as far as these N individuals are concerned. Thus,
on the null hypothesis, there are r = a + b individuals who will react and s = c + d who will
not, whatever the assignment of treatments.


20. The chance that a will react in m and b = r − a in n is, therefore, if the hypothesis
be true,

$$P_1\{a \mid N, r, m\} = \frac{m!\, n!\, r!\, s!}{a!\, b!\, c!\, d!\, N!}. \tag{1}$$

This expression is proportional to the coefficient of x^a in the hypergeometric series

$$F(\alpha, \beta, \gamma, x) = F(-r, -m, n-r+1, x). \tag{2}$$

Thus, taking m ≥ n, a can assume values of
(i) 0, 1, ..., r if r ≤ n,
(ii) r − n, r − n + 1, ..., r if n < r ≤ m,
(iii) r − n, r − n + 1, ..., m if r > m.
For this probability distribution, it is known (K. Pearson (1899) and Kendall (1943, p. 127))
that

$$\text{Mean } a = \frac{rm}{N}, \tag{3}$$

$$\text{Variance of } a = \sigma_a^2 = \frac{mnrs}{N^2(N-1)}. \tag{4}$$

[Fig. 2. Axis annotations: ‘Treatment 2 more likely to cause reaction’ (towards the left), ‘Treatment 1 more likely to cause reaction’ (towards the right).]


21. For the particular case
N = 20, r = 7, m = 12, n = 8,
the terms in the distribution of P₁{a | 20, 7, 12} are shown as ordinates in Fig. 2 and given in
the accompanying Table 2. The experimental probability set consists of the eight
alternative values for a, viz. 0, 1, ..., 7 with which the probabilities tabled are associated if H₀ is
true. Further

$$\text{Mean } a = \bar{a} = 4.2, \qquad \sigma_a = 1.0721. \tag{5}$$
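The figures quoted for this special case can be checked directly from equations (1), (3) and (4). The following is a minimal sketch, assuming only the Python standard library; the function name `hypergeom_pmf` is illustrative and does not come from the paper.

```python
from math import comb, sqrt

def hypergeom_pmf(a, N, r, m):
    """P1{a | N, r, m} of equation (1), written with binomial coefficients.

    m! n! r! s! / (a! b! c! d! N!) equals C(m, a) * C(n, b) / C(N, r),
    where n = N - m and b = r - a.
    """
    n = N - m
    b = r - a
    if a < 0 or b < 0 or a > m or b > n:
        return 0.0
    return comb(m, a) * comb(n, b) / comb(N, r)

N, r, m = 20, 7, 12
n, s = N - m, N - r

# The eight points of the experimental probability set, a = 0, 1, ..., 7.
probs = {a: hypergeom_pmf(a, N, r, m) for a in range(r + 1)}

mean = r * m / N                              # equation (3)
sd = sqrt(m * n * r * s / (N**2 * (N - 1)))   # square root of equation (4)

for a, p in probs.items():
    print(a, round(p, 4))                     # compare with the "Chance of a" column of Table 2
print(round(mean, 4), round(sd, 4))           # compare with equation (5)
```

Writing the factorial expression as a ratio of binomial coefficients keeps the arithmetic in exact integers until the final division.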




22. Next consider Step 2. The purpose of the investigation is to test the hypothesis
that the difference between a/12 and (r − a)/8 has resulted simply from a random
partition of 20 individuals, of whom r will show reaction X in whichever treatment
group they are included. The experiment gives r = 7. The contour levels fall between
the 8 points of the set as shown in Fig. 2; the further a lies towards the right, the more
inclined we shall be to accept the alternative hypothesis that a/12 > (r − a)/8 because
treatment 1 is more effective than treatment 2. The further a lies to the left, the more
we shall incline towards the reverse alternative. To complete Step 3, we have only to
calculate the sums of the tail terms of the hypergeometric series, as shown in Table 2 for
the special case.


Table 2. Problem I. Chances for special case N = 20, r = 7, m = 12, if H₀ is true

                         Chance of a or less
  a     Chance of a    True value    Normal approx.
  0       0.0001         0.000          0.000
  1       0.0043         0.004          0.006
  2       0.0477         0.052          0.056
  3       0.1987         0.251          0.257
  4       0.3576           -              -

                         Chance of a or more
  a     Chance of a    True value    Normal approx.
  5       0.2861         0.392          0.390
  6       0.0954         0.106          0.113
  7       0.0102         0.010          0.016




















23. Having set up the machinery of the test, we come to the practical question. Beyond
which contour levels must a fall before we infer that there is a treatment difference? Not,
I think, in the example, if a were 3, 4 or 5; possibly if a = 6, more probably if a = 2 and almost
certainly if a = 0, 1 or 7. Were we to fix as critical levels those between a = 1 and 2 on the
one hand, and between a = 6 and 7 on the other, then we should be guided in our decision
by the following knowledge: if there were no treatment difference, so that seven out of the
twenty individuals would have shown reaction X whichever treatment were applied, then
the chance under random assignment of treatments that a < 2 or > 6 is only 0.014 or 1 in 70.
Had we taken the critical levels between 2 and 3 and between 6 and 7, the corresponding
chance would be 0.062 or 1 in 16. This summing up in terms of probability helps towards
the balanced decision on the next practical step to be taken, because it helps us to assess the
extent of purely chance fluctuations that are possible. It may be assumed that in a matter
of importance we should never be content with a single experiment applied to twenty
individuals; but the result of applying the statistical test with its answer in terms of the chance
of a mistaken conclusion if a certain rule of inference were followed, will help to determine













the lines of further experimental work and the degree of confidence with which we proceed 
provisionally to adopt a new technique. 
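The two chances quoted in paragraph 23 can be recomputed from the hypergeometric terms. This is a minimal sketch, assuming only the Python standard library; `rejection_chance` is an illustrative name. Summed without intermediate rounding, the first chance comes out near 0.0147, whereas the text's 0.014 corresponds to adding the rounded tail entries 0.004 and 0.010 of Table 2.

```python
from math import comb

N, r, m = 20, 7, 12
n = N - m

# Hypergeometric terms P1{a | 20, 7, 12} for a = 0, 1, ..., 7.
pmf = [comb(m, a) * comb(n, r - a) / comb(N, r) for a in range(r + 1)]

def rejection_chance(lower, upper):
    """Chance, if H0 is true, that a <= lower or a >= upper."""
    return sum(p for a, p in enumerate(pmf) if a <= lower or a >= upper)

# Critical levels between a = 1 and 2, and between a = 6 and 7.
narrow = rejection_chance(1, 7)
# Critical levels between a = 2 and 3, and between a = 6 and 7.
wide = rejection_chance(2, 7)

print(f"{narrow:.4f} {wide:.4f}")
```

Moving the lower critical level by a single step changes the chance by a factor of about four, which illustrates how coarse the available levels are when the probability set has only eight points.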


24. An experiment falling under this head has the advantage that the random process 
introduced is under complete control. The analysis will give an answer in probability terms 
whether the N individuals have been randomly selected from a larger whole or not. But this 
answer is limited in the sense that it relates only to the N; if we wish to draw conclusions 
about a wider population or populations, then a random selection of the N or, separately, 
of both its parts m and n is needed. Thus we come to Problems II and III. 


25. Approximation to the hypergeometric terms. When dealing with small numbers, the
calculation of the tail terms of the series may not be laborious, but it soon becomes so when
r is large. An obvious approximation is that obtained by using an integral under the normal
curve with the mean and standard deviation of equations (3) and (4) to represent the sum
of the hypergeometric terms. As usual when approximating to the sum of the terms for
x = a, a + 1, a + 2, ..., etc., of a discrete probability distribution by the integral under a
continuous curve, we take this integral from the point x = a − ½. Thus Fig. 3 shows the
normal curve

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma_a} \exp\left[-\tfrac{1}{2}(x - \bar{a})^2/\sigma_a^2\right], \tag{6}$$

with ā and σ_a as in equations (5), and the approximation to the sum of the hypergeometric
terms for a = 6 and 7 is

$$\int_{5.5}^{\infty} p(x)\, dx,$$

represented by the area marked with cross-hatching. The approximations for different
levels are shown in Table 2, and are seen in this case to be quite adequate for the purpose of
the test. Further comparisons are made in the Appendix, and it appears that provided m
and n are fairly nearly equal, as they are likely to be in most planned experiments of the
Problem I type, the normal approximation is surprisingly good. Yates (1934) has suggested
a method of further correction.
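The comparison between true tail sums and their normal approximations can be reproduced as follows. This is a minimal sketch, assuming the Python standard library (`statistics.NormalDist` supplies the normal integral); the helper names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

N, r, m = 20, 7, 12
n, s = N - m, N - r
mean = r * m / N                              # equation (3)
sd = sqrt(m * n * r * s / (N**2 * (N - 1)))   # from equation (4)
curve = NormalDist(mean, sd)                  # the normal curve of equation (6)

def exact_upper(a):
    """True chance of a or more: sum of the upper hypergeometric terms."""
    return sum(comb(m, x) * comb(n, r - x) for x in range(a, r + 1)) / comb(N, r)

def normal_upper(a):
    """Normal approximation: area to the right of x = a - 1/2."""
    return 1.0 - curve.cdf(a - 0.5)

def exact_lower(a):
    """True chance of a or less: sum of the lower hypergeometric terms."""
    return sum(comb(m, x) * comb(n, r - x) for x in range(a + 1)) / comb(N, r)

def normal_lower(a):
    """Normal approximation: area to the left of x = a + 1/2."""
    return curve.cdf(a + 0.5)

for a in (0, 1, 2, 3):
    print(a, round(exact_lower(a), 3), round(normal_lower(a), 3))
for a in (5, 6, 7):
    print(a, round(exact_upper(a), 3), round(normal_upper(a), 3))
```

The printed pairs can be compared with the "True value" and "Normal approx." columns of Table 2; for instance the approximation for a = 6 and 7 is the integral from 5.5 described above.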


26. The correction for continuity. In the 2 x 2 table connexion, the improvement obtained
by taking the normal integral (i) from x = a − ½ if a > ā or (ii) from x = a + ½ if a < ā (so that
we are summing for the lower tail), was pointed out by Yates (1934) and has often been
termed ‘Yates’s correction for continuity’. It is, however, the natural adjustment to make
on the basis of the Euler-Maclaurin theorem, when approximating to a sum of ordinates
by an integral, and without wishing to detract from the value of Yates’s suggestion in this
particular problem, it should be pointed out that the adjustment was used by statisticians
well before 1934, when employing a normal or skew curve to give the sum of terms of a
binomial or hypergeometric series.*


(v) PROBLEM II


27. This may be described as the test of whether the proportion of individuals bearing a
character A is the same in two different populations, from each of which a random sample
has been drawn, i.e. the test of the hypothesis that

$$p_1(A) = p_2(A) = p, \tag{7}$$


* The method was in use in the Department of Applied Statistics when I joined the staff in 1921, 
and may have been current many years before that. 




where p is some common but unspecified proportion. Barnard describes this as the case of
the 2 x 2 comparative trial. Here m individuals have been drawn at random from the first
population and n from the second, and it is found that a/m and b/n, respectively, bear the
character A. The conditions are assumed to be such that if the random procedure of selection
were repeated, the appropriate probability distributions for a and b would be given by the
terms of binomial expansions. Table 3 shows the observed results.

Table 3

                 A       Not A     Total
1st sample       a         c         m
2nd sample       b         d         n
Total            r         s         N























Fig. 3. The curves ABC and A′B′C′ represent the significance contours L_ε and L′_ε, respectively.


In this problem there have been two applications of a random selection process, not one
as for Problem I, and the experimental probability set consists of the (m + 1)(n + 1)
alternative values of the doublet (a, b) (0 ≤ a ≤ m, 0 ≤ b ≤ n) which can be represented in the lattice
diagram shown in Fig. 3 for the special case m = 12, n = 8. It might, of course, be argued
that in the hypothetical repetition of the selection process m and n need not remain constant,
but this, I think, would introduce an unnecessary complication into the probability set-up.










28. The question before us is whether the result (a, b) is consistent with the hypothesis
H₀ defined in equation (7) above, or whether it suggests that either p₁ > p₂ or that p₁ < p₂.
A little reflexion shows that we have no reason to reject H₀ if the point (a, b) lies near the
diagonal line on which a/m = b/n, but, broadly speaking, are more and more likely to do so
the farther the point falls from this line in the direction of the corners (0, n) and (m, 0) of the
lattice diagram. This statement requires amplification. In defining the significance contours
we may consider the following question: If H₀ is not true, what departures from equality
in p₁ and p₂ do we regard it of equal importance to detect? Should the power of the test be
roughly the same for constant values, for example, of

$$(a)\ p_1 - p_2, \qquad (b)\ p_1/p_2 \qquad \text{or} \qquad (c)\ \frac{p_1}{1-p_1} \Big/ \frac{p_2}{1-p_2}\,?$$

The procedure which I have adopted in the sections which follow is frankly one of
expediency. I have not considered in detail how to choose a family of significance contours
satisfying requirements formulated in advance, but have taken those suggested by the
customary large-sample procedure which gives contours of the form ABC, A′B′C′ drawn
in Fig. 3. These will, I believe, make the power of the test to detect a difference more nearly
dependent on the ratio of the odds given by (c) than on either of the expressions (a) or (b).
E. B. Wilson (1941) chooses the expression (a). This point, however, needs further
investigation. It should be noted that a similar problem, in the case where the sampling
distributions follow the Poisson law, was discussed very fully by Przyborowski & Wilenski (1939).
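The three ways of measuring a departure from equality can be compared numerically. This is a minimal sketch, assuming that (c) denotes the ratio of the odds p₁/(1 − p₁) to p₂/(1 − p₂), as the phrase "ratio of the odds" suggests; the function name and the example proportions are illustrative only.

```python
def departure_measures(p1, p2):
    """Return (a) the difference, (b) the ratio and (c) the odds ratio of p1 and p2."""
    difference = p1 - p2
    ratio = p1 / p2
    odds_ratio = (p1 / (1.0 - p1)) / (p2 / (1.0 - p2))
    return difference, ratio, odds_ratio

# Two pairs of proportions with the same difference of 0.2.
for p1, p2 in ((0.6, 0.4), (0.3, 0.1)):
    d, q, o = departure_measures(p1, p2)
    print(p1, p2, round(d, 3), round(q, 3), round(o, 3))
```

The two pairs agree on measure (a) but differ on (b) and (c), so a test whose power is roughly constant for one measure cannot be so for the others.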


29. Besides involving a 2-dimensional instead of a 1-dimensional experimental pro- 
bability set, Problem II differs from Problem I in that we need an answer which is indepen- 
dent of the unknown common probability p of the null hypothesis. In Problem I the part 
of p was played by the fraction r/N given by the data. We are concerned now with what 
Neyman and I (Neyman & Pearson, 1933) have termed a composite hypothesis, and were 
it possible would like the contour levels to bound regions which are ‘similar to the sample 
space with regard to the parameter p’ (loc. cit. p. 313) (i.e. are independent of p). The 
following considerations show the lines along which a first attack of the problem can proceed. 


30. If H₀ is true and equation (7) holds, then the probability of the observed result may
be written*

$$P_2\{a \mid p, m\} \times P_2\{b \mid p, n\} = \frac{m!}{a!\,c!}\, p^a (1-p)^c \times \frac{n!}{b!\,d!}\, p^b (1-p)^d \tag{8.1}$$

$$= \frac{N!}{r!\,s!}\, p^r (1-p)^s \times \frac{m!\, n!\, r!\, s!}{a!\, b!\, c!\, d!\, N!} \tag{8.2}$$

$$= P_2\{r \mid p, N\} \times P_1\{a \mid N, r, m\}. \tag{8.3}$$

Thus the probability of obtaining the doublet (a, b) in sampling from two populations with
a common p may be regarded as the product of two terms:

(i) The probability that a + b = r or that the point (a, b) in Fig. 3 falls on a diagonal line
on which r = constant. This probability, P₂{r | p, N}, is the (r + 1)th term in the expansion
of the binomial

$$\{(1-p) + p\}^N.$$

(ii) The relative probability, given r, of the observed partition into a and b = r − a; this
is independent of p and is identical with the expression P₁{a | N, r, m} of equation (1), i.e. is
proportional to a term of the hypergeometric series (2).
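The factorization of the joint probability into a binomial term for r and a hypergeometric term for a can be verified numerically over the whole lattice. This is a minimal sketch, assuming only the Python standard library; the function names and the trial value p = 0.3 are illustrative.

```python
from math import comb

def binomial_pmf(k, size, p):
    """Binomial probability of k successes in size trials."""
    return comb(size, k) * p**k * (1.0 - p) ** (size - k)

def hypergeom_pmf(a, N, r, m):
    """Hypergeometric probability of equation (1), which does not involve p."""
    return comb(m, a) * comb(N - m, r - a) / comb(N, r)

def max_factorization_error(m, n, p):
    """Largest gap between the two sides of the identity over all doublets (a, b)."""
    N = m + n
    worst = 0.0
    for a in range(m + 1):
        for b in range(n + 1):
            r = a + b
            joint = binomial_pmf(a, m, p) * binomial_pmf(b, n, p)
            product = binomial_pmf(r, N, p) * hypergeom_pmf(a, N, r, m)
            worst = max(worst, abs(joint - product))
    return worst

print(max_factorization_error(12, 8, 0.3))
```

The gap is of the order of floating-point rounding for any p strictly between 0 and 1, which is the sense in which the second factor is independent of p.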


* It will be seen that P₁{ } has been used to denote a hypergeometric probability and P₂{ } a
binomial probability.










31. If, now, it were possible to draw a boundary line L_ε such as ABC shown in Fig. 3,
cutting off at the end of each diagonal, r = constant, a group of points (a, r − a) such that

$$\sum_a \left[P_1\{a \mid N, r, m\}\right] = \epsilon, \tag{9}$$

where ε is a fraction between 0 and 1 chosen at will, then the requirement of Step 3 would
be satisfied. For in rejecting H₀ when (a, b) falls beyond this boundary,* the chance of doing
so if H₀ were true would be

$$\sum_{r=0}^{N} \left[P_2\{r \mid p, N\} \times \epsilon\right] = \epsilon \times \sum_{r=0}^{N} \left[P_2\{r \mid p, N\}\right] = \epsilon, \tag{10}$$

i.e. would be independent of the unknown common p of the hypothesis tested. The test
would then be analogous to ‘Student’s’ test for the significance of the difference between
two means, where we have a system of contour levels L_ε each associated with a chance ε,
independent of the values of any unknown parameters which are irrelevant to the
composite hypothesis tested.


32. Unfortunately, this objective cannot be achieved because we are not dealing with
continuous probability distributions and P₁{a | N, r, m} exists only at discrete, integral
values of a. If we follow the present line of approach, all that is possible is to take contour
or significance levels which cut off from an end of each diagonal, r = constant, a group of
points for which

$$\sum_a \left[P_1\{a \mid N, r, m\}\right] = E_r \le \epsilon. \tag{11}$$

Then, in rejecting H₀ when (a, b) falls beyond such a contour, we know that the chance of
doing so, if H₀ is true, will be

$$\sum_{r=0}^{N} \left[P_2\{r \mid p, N\} \times E_r\right] \le \epsilon. \tag{12}$$

It is clear that the amount by which the probability falls below ε will be a function of p,
and that in taking Step 3 we are only associating with each significance level L_ε an upper
limit, ε, to the probability of rejecting H₀ when it is true.


33. We have still, of course, to determine the most appropriate system of significance
levels and to set out a ready means of finding an upper limit, ε, associated with the level on
which an observed doublet (a, b) falls.† Mr Barnard has broken new ground in

(i) defining for this Problem II one systematic method of determining a family of levels
L_ε based on certain clearly defined principles;

(ii) determining the true upper bound to the associated probability ε which, in the case
of small samples at any rate, may be considerably below that which has hitherto been used.

Since, however, much tabling is needed before his theoretical advance can be followed
by a practical working rule available for samples of any sizes, m and n, I think it is worth
while describing the cruder handling of the lattice diagram which I had discussed in 1938-9


* There would be a similar series of boundaries, L′_ε, below the diagonal a/m = b/n, such as A′B′C′
of Fig. 3.

† The likelihood ratio λ might be used in determining the family of significance contours, as was
suggested in connexion with the general χ² problem (Neyman & Pearson, 1928b, p. 283). In large
samples λ would approximately equal e^{−½u²}, where u is given by equation (22) below.
















lectures. This involves, perhaps, not much more than a restatement of what may be termed
the classical approach to Problem II (see paras. 43 and 44 below), but it does bring out the
difference between Problems I and II, which I think important.


34. It may be well to emphasize here that this distinction between the handling of
Problems I and II is not universally accepted. Fisher has set out his approach as follows in
a paper read before the Royal Statistical Society (1935): ‘To the many methods of treatment 
hitherto suggested for the 2 x 2 table the concept of ancillary information suggests this new 
one. Let us blot out the contents of the table, leaving only the marginal frequencies. If it 
be admitted that these marginal frequencies by themselves supply no information on the 
point at issue, namely, as to the proportionality of the frequencies in the body of the table, 
we may recognize the information they supply as wholly ancillary; and therefore recognize 
that we are concerned only with the relative probabilities of occurrence of the different ways 
in which the table can be filled in, subject to these marginal frequencies.’ 

This view has also been supported by Yates (1934). As I understand it, Fisher would refer 
the observation (a,b) to a linear set (as in my Problem I), however the data have been 
collected; this attitude follows readily if we discard the requirement that the probability 
distribution used in the test must be related to the frequency distribution that would be 
generated by repeated application of the random sampling process employed in the experi- 
ment. It will be seen that with Fisher’s approach there is a gain in simplicity in handling 
the analysis; it must remain a matter of opinion whether there is a loss in the relevance 
of the probability construct to the question at issue. It is, of course, only when handling 
small samples or in cases where (a,b) lies close to one of the corners (0,0) or (m,n) of the 
lattice that this need for choice between probability constructs is thrust upon us. 


(vi) SOLUTION OF PROBLEM II, USING THE NORMAL APPROXIMATION


35. If the samples are large, the calculation of hypergeometric terms becomes laborious
and we turn naturally, as in so many other statistical problems, to the approximation using
the normal curve. In fact, except when r or s are very small or m and n very different in
magnitude, the normal curve with mean and standard deviation given by equations (3)
and (4) provides a surprisingly good approximation to the relative probability distribution
of a for fixed r, viz. P₁{a | N, r, m} (see Appendix). Define u_ε as the deviate of the standardized
normal curve for which

$$\epsilon = \int_{u_\epsilon}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}u^2}\, du \qquad (\epsilon < \tfrac{1}{2}). \tag{13}$$

Then we can draw across the lattice diagram a significance level L_ε above and another L′_ε
below* the diagonal a/m = b/n such that
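(i) all points (a, b) for which

$$\frac{\bar{a} - a - \tfrac{1}{2}}{\sigma_a} \ge u_\epsilon \tag{14}$$

lie beyond, i.e. above, L_ε;
(ii) and all points (a, b) for which

$$\frac{a - \bar{a} - \tfrac{1}{2}}{\sigma_a} \ge u_\epsilon \tag{15}$$

lie beyond, i.e. below, L′_ε.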


* The words ‘above’ and ‘below’ are used in the sense of Figs. 3 and 4. 










If we wish to take special action either when a/m is significantly less than b/n or significantly
greater, then we shall use both levels L_ε and L′_ε; if only, however, when a/m < b/n, then we use
L_ε. The corresponding probability levels would be obtained by making ε for the second case
twice its value for the first. Fig. 4 shows the 247 relative probabilities P₁{a | N, r, m} for the
case m = 18, n = 12. The unbroken, stepped lines are two contour levels determined in this
way. Purely for convenience in drawing, the level with ε = 0.05 and u_0.05 = 1.6449 has been
put above the diagonal and that with ε = 0.01 and u_0.01 = 2.3263 below.
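The two normal deviates, and the count of lattice points, can be confirmed in a few lines. This is a minimal sketch, assuming the Python standard library; `normal_deviate` is an illustrative name for the quantity u_ε defined in equation (13).

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def normal_deviate(eps):
    """Deviate u such that the area of the standard normal curve beyond u equals eps."""
    return standard_normal.inv_cdf(1.0 - eps)

u_05 = normal_deviate(0.05)
u_01 = normal_deviate(0.01)

m, n = 18, 12
lattice_points = (m + 1) * (n + 1)  # number of doublets (a, b) in the lattice diagram

print(round(u_05, 4), round(u_01, 4), lattice_points)
```

The count (m + 1)(n + 1) gives the 247 relative probabilities plotted for the case m = 18, n = 12.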


36. If the normal approximation to the hypergeometric series were correct, it would
follow that along every diagonal, r = constant, the sum of the relative probabilities for
points above L_ε would satisfy the inequality (11). Hence the inequality (12) for the complete
area of the lattice above L_ε would hold, whatever the value of the common p. A similar result
would hold for the area below L′_ε. Of course, the normal approximation will not hold
precisely, particularly when r or s are small, but here we shall generally be on the safe side, in
the sense that the hypergeometric distribution is flat-topped with abrupt ends so that the
E_r of equation (11) will be considerably less than ε, and often zero.


37. It is interesting to examine the results set out in Fig. 4 with the help of the detailed
calculations given in Table 4. Columns (2) and (3) give, for constant r, the mean and standard
deviation of P₁{a | 30, r, 18}, while columns (4) (for L_0.05) and (8) (for L′_0.01) give the cut-off
points defined by the normal approximation, i.e.

$$a_1 = \bar{a} - \tfrac{1}{2} - u_{0.05} \times \sigma_a \quad \text{and} \quad a_2 = \bar{a} + \tfrac{1}{2} + u_{0.01} \times \sigma_a. \tag{16}$$

The sums of the relative probabilities P₁{a | 30, r, 18} for a ≤ a_1 and a ≥ a_2 are given in
cols. (5) and (9) respectively. Thus, for example, for r = 7

a_1 = 4.2 − 0.5 − 1.6449 × 1.1543 = 1.80,

and the sum of the probabilities for a = 0 and 1 is

0.0004 + 0.0082 = 0.0086.

These are the tail sums, termed E_r in equation (11). It is clear from an examination of
cols. (5) and (9) that they are all less, and many of them very much less than 0.05 and 0.01.
This is inevitable with a discrete distribution containing few terms. The contour levels have
been drawn conventionally in Fig. 4 as steps passing through the half-integer points and
not through the cut-off points of cols. (4) and (8). Clearly, whichever way they are drawn,
they will separate off the same subset of the (m + 1)(n + 1) points in the lattice diagram.
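The worked example for r = 7 can be reproduced, and extended to any other diagonal, with a short routine. This is a minimal sketch, assuming the Python standard library and the cut-off rule with the ½ adjustment described above; the function name `table4_row` and its return layout are illustrative, not the paper's.

```python
from math import comb, sqrt
from statistics import NormalDist

def table4_row(r, m=18, n=12, eps_lower=0.05, eps_upper=0.01):
    """Mean, s.d., cut-off points and tail sums along the diagonal a + b = r."""
    N = m + n
    mean = r * m / N
    sd = sqrt(m * n * r * (N - r) / (N**2 * (N - 1)))
    u_lower = NormalDist().inv_cdf(1.0 - eps_lower)
    u_upper = NormalDist().inv_cdf(1.0 - eps_upper)
    a1 = mean - 0.5 - u_lower * sd   # cut-off for the lower tail
    a2 = mean + 0.5 + u_upper * sd   # cut-off for the upper tail
    # Hypergeometric terms; math.comb returns 0 for impossible partitions.
    pmf = [comb(m, a) * comb(n, r - a) / comb(N, r) for a in range(r + 1)]
    lower_sum = sum(p for a, p in enumerate(pmf) if a <= a1)
    upper_sum = sum(p for a, p in enumerate(pmf) if a >= a2)
    return mean, sd, a1, lower_sum, a2, upper_sum

mean, sd, a1, lower_sum, a2, upper_sum = table4_row(7)
print(round(mean, 1), round(sd, 4), round(a1, 2), round(lower_sum, 4),
      round(a2, 2), round(upper_sum, 4))
```

Calling `table4_row(r)` for r = 0, 1, ..., 30 gives one way of regenerating the Method 1 columns; small differences in the last decimal place are possible where tabled sums were formed from individually rounded terms.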


38. The next question is this. If we were to use either of these levels, what in fact would
be the chance of the sample doublet (a, b) falling beyond, if the null hypothesis were true?
This will depend on the common value of p. The product sums

$$\sum_{r=0}^{N} \left(P_2\{r \mid p, N\} \times E_r\right) = \sum_{r=0}^{N} \left[\frac{N!}{r!\,s!}\, p^r (1-p)^s \times E_r\right], \tag{17}$$

obtained by multiplying the expressions in cols. (5) and (9) of Table 4 by the appropriate
binomial terms are shown for a variety of values of p in Table 5, cols. (2) and (3). It is clear
at once how far on the safe side we are in saying that these chances are < 0.05 and 0.01
respectively. Similar calculations were carried out for a second example, taking m = n = 10,











[Fig. 4. Hypergeometric probabilities in lattice diagram for m = 18, n = 12. Normal curve approximations to significance levels. Above diagonal, L_0.05; below diagonal, L′_0.01. Full line: with ½ adjustment; broken line: without ½ adjustment.]












































Table 4. Significance levels for case m = 18, n = 12

                     Details for L_0.05: ε = 0.05, u_0.05 = 1.6449      Details for L′_0.01: ε = 0.01, u_0.01 = 2.3263
                     Method 1              Method 2                 Method 1              Method 2
                     Cut-off     Sum of    Cut-off     Sum of       Cut-off     Sum of    Cut-off     Sum of
                     ā−½−u σ_a   terms     ā−u σ_a     terms        ā+½+u σ_a   terms     ā+u σ_a     terms
  r     ā     σ_a                beyond                beyond                   beyond                beyond       r
                                 cut-off               cut-off                  cut-off               cut-off
 (1)   (2)    (3)      (4)        (5)       (6)         (7)          (8)         (9)       (10)        (11)      (12)

  0    0      0       −0.50     0          0         0             0.50      0          0         0           0
  1    0.6    0.4899  −0.71     0         −0.21      0             2.24      0          1.74      0           1
  2    1.2    0.6808  −0.42     0          0.08      0.1517        3.28      0          2.78      0           2
  3    1.8    0.8187  −0.05     0          0.45      0.0542        4.20      0          3.70      0           3
  4    2.4    0.9277   0.37     0.0181     0.87      0.0181        5.06      0          4.56      0           4
  5    3.0    1.0171   0.83     0.0056     1.33      0.0681        5.87      0          5.37      0           5
  6    3.6    1.0917   1.30     0.0256     1.80      0.0256        6.64      0          6.14      0           6
  7    4.2    1.1543   1.80     0.0086     2.30      0.0681        7.39      0          6.89      0.0156      7
  8    4.8    1.2069   2.31     0.0267     2.81      0.0267        8.11      0          7.61      0.0075      8
  9    5.4    1.2507   2.84     0.0091     3.34      0.0618        8.81      0.0034     8.31      0.0034      9
 10    6.0    1.2865   3.38     0.0241     3.88      0.0241        9.49      0.0015     8.99      0.0209     10
 11    6.6    1.3152   3.94     0.0080     4.44      0.0524       10.16      0.0006     9.66      0.0102     11
 12    7.2    1.3370   4.50     0.0197     5.00+     0.0982       10.81      0.0046    10.31      0.0046     12
 13    7.8    1.3524   5.08     0.0414     5.58      0.0414       11.45      0.0020    10.95      0.0195     13
 14    8.4    1.3615   5.66     0.0145     6.16      0.0777       12.07      0.0007    11.57      0.0091     14
 15    9.0    1.3646   6.26     0.0301     6.76      0.0301       12.67      0.0038    12.17      0.0038     15
 16    9.6    1.3615   6.86     0.0091     7.36      0.0572       13.27      0.0015    12.77      0.0145     16
 17   10.2    1.3524   7.48     0.0195     7.98      0.0195       13.85      0.0060    13.35      0.0060     17
 18   10.8    1.3370   8.10     0.0380     8.60      0.0380       14.41      0.0022    13.91      0.0197     18
 19   11.4    1.3152   8.74     0.0102     9.24      0.0689       14.96      0.0080    14.46      0.0080     19
 20   12.0    1.2865   9.38     0.0209     9.88      0.0209       15.49      0.0026    14.99      0.0241     20
 21   12.6    1.2507  10.04     0.0401    10.54      0.0401       16.01      0.0006    15.51      0.0091     21
 22   13.2    1.2069  10.71     0.0075    11.21      0.0727       16.51      0.0025    16.01      0.0025     22
 23   13.8    1.1543  11.40     0.0156    11.90      0.0156       16.99      0.0086    16.49      0.0086     23
 24   14.4    1.0917  12.10     0.0313    12.60      0.0313       17.44      0.0016    16.94      0.0256     24
 25   15.0    1.0171  12.83     0         13.33      0.0601       17.87      0.0056    17.37      0.0056     25
 26   15.6    0.9277  13.57     0         14.07      0.1117       18.26      0         17.76      0.0181     26
 27   16.2    0.8187  14.35     0         14.85      0            18.60      0         18.10      0          27
 28   16.8    0.6808  15.18     0         15.68      0            18.88      0         18.38      0          28
 29   17.4    0.4899  16.09     0         16.59      0            19.04      0         18.54      0          29
 30   18.0    0       17.50     0         18.00      0            18.50      0         18.00      0          30










and the results are shown in Table 5, cols. (6) and (7). In this case, the actual chances of
(a, b) falling on or beyond the significance levels are even further below the nominal limits
of 0.05 and 0.01. In fact, it becomes clear that in the case of small samples, at any rate, this
method of introducing the normal approximation gives such an overestimate of the true
chances of falling beyond a contour as to be almost valueless.


Table 5. Showing the difference between nominal and actual significance levels

Entries in cols. (2) to (9) are the true chance of falling on or beyond the level named.

            1st example: m = 18, n = 12             2nd example: m = 10 = n
            Method 1          Method 2              Method 1          Method 2
  p       L_0.05  L'_0.01   L_0.05  L'_0.01       L_0.05  L'_0.01   L_0.05  L'_0.01      p
 (1)       (2)     (3)       (4)     (5)           (6)     (7)       (8)     (9)        (10)

 0.05    0.0010  0.0000    0.0478  0.0000        0.0000  0.0000    0.0069  0.0000      0.05
 0.1     0.0054  0.0000    0.0602  0.0003        0.0005  0.0000    0.0251  0.0005      0.1
 0.2     0.0141  0.0003    0.0483  0.0043        0.0037  0.0007    0.0455  0.0037      0.2
 0.3     0.0174  0.0012    0.0490  0.0091        0.0058  0.0014    0.0495  0.0058      0.3
 0.4     0.0204  0.0023    0.0542  0.0108        0.0062  0.0017    0.0546  0.0062      0.4
 0.5     0.0219  0.0028    0.0498  0.0109        0.0062  0.0015    0.0572  0.0062      0.5
 0.6     0.0221  0.0035    0.0437  0.0119        Repeat as for 1 − p                   0.6
 0.7     0.0204  0.0037    0.0431  0.0120                                              0.7
 0.8     0.0126  0.0031    0.0459  0.0113                                              0.8
 0.9     0.0019  0.0009    0.0282  0.0052                                              0.9
 0.95    0.0001  0.0001    0.0058  0.0010                                              0.95





























39. Before considering a second method, it will be useful to recapitulate certain characteristics of what I have termed Method 1. It provides for any nominal value of ε one systematic procedure of defining a critical boundary or significance level cutting off a region from the lattice diagram. Neither the subgroup of points cut off, nor the sum of the probabilities associated with them for a given p, will alter continuously with ε; they will change by discrete steps as the cut-off point, defined in para. 37, passes through a point (a, b). While we shall sometimes want to know whether the observed (a, b) falls beyond a level L_ε specified in advance, more often we shall ask what is the level on which (a, b) falls. This, using Method 1, we find by calculating

$$u = \frac{a + \tfrac{1}{2} - \bar{a}}{\sigma_a} \ \text{ if } a < \bar{a}, \quad \text{or} \quad u = \frac{a - \tfrac{1}{2} - \bar{a}}{\sigma_a} \ \text{ if } a > \bar{a}, \tag{18}$$

and finding ε from the normal integral of equation (13). In this way the nominal chance ε will be a little nearer the true upper limit than the figures in Table 5 suggest,* but not enough to modify the criticism expressed above.


* It will be seen from Table 4 that no point (a, b) gives a β_ε in cols. (5) and (9) of exactly 0.05 or 0.01,
respectively, so that no points actually lie on L_0.05 or L'_0.01.










40. Method 2. The introduction of the correction of ½ for continuity is certainly appropriate in using the normal approximation to the hypergeometric series in Problem I, but I think it is not helpful in Problem II where we are concerned with a 2-dimensional experimental probability set. If instead of obtaining significance levels L_ε and L'_ε as in paras. 35-37, we obtain them from inequalities similar to (14) and (15) but with the correction of ½ omitted, then there are several points to be noted:

(a) For the significance level L_ε, the expression

$$\beta_\epsilon = \sum \left[ P\{a \mid N, r, m\} \right], \tag{19}$$

where the summation is for values of a on the diagonal, r = constant, for which

$$a \le a_\epsilon = \bar{a} - u_\epsilon \sigma_a, \tag{20}$$

will be sometimes less and sometimes greater than ε. Hence, in the balance, it seems likely that the chance of the point (a, b) lying beyond L_ε, or

$$\sum_{r=0}^{N} \left[ \frac{N!}{r!\,s!}\, p^{r} (1-p)^{s} \times \beta_\epsilon \right], \tag{21}$$

will lie closer to ε than when the ½ correction is used. The position will be the same for L'_ε.

(b) In drawing repeated samples of m and n from two populations in which there is a common chance, p, of an individual possessing character A, the ratio

$$u = \frac{a - \bar{a}}{\sigma_a} = \frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^{2}(N-1)}}} \tag{22}$$

has, whatever be p, (i) an expectation of zero, (ii) a unit standard deviation.* The shape of the distribution will, of course, depend on p, but, faute de mieux, we may not in the long run do too badly by assuming it to be normal. It is, of course, the weighted combination of a number of hypergeometric series whose shape depends on r.


41. Consider the result of applying this Method 2 to the case m = 18, n = 12 already discussed. The procedure for determining the 0.05 and 0.01 significance levels will be exactly as under Method 1, except that the continuity correction of ½ is omitted. The resulting levels are shown as dashed, stepped lines in Fig. 4.† They fall, on the whole, inside the significance levels obtained by Method 1. Now turn to Table 4, where cols. (6) and (10) show the cut-off points a half unit further in towards the diagonal a/m = b/n. Cols. (7) and (11) give the values of β_ε; some of these are considerably above the nominal values of ε = 0.05 and 0.01, others are still well below. But from the approach to Problem II that has been adopted, this is immaterial since the experimental probability set is the 2-dimensioned one of the lattice diagram and is not restricted to the diagonal r = constant on which the observed point (a, b) may happen to lie. What we are concerned with is the summed chance given by expression (21) and the value of this is given for eleven values of p in cols. (4) and (5) of Table 5. It will be seen that this true chance does sometimes exceed the nominal values of 0.05 and 0.01,


* Provided cases where r or s are zero, making the expression (22) indeterminate with u=0/0, 
are excluded. Mr Barnard has pointed out that one way of avoiding this exclusion would be to lay 
down that, when u=0/0, we assign to the ratio a value chosen at random from a population (say 
normal) with zero mean and unit variance. 


† Again, for convenience the 5 % level is drawn above and the 1 % level below the diagonal.










but never by very much. Again, for the second example with m = 10 = n (Table 5, cols.
(8) and (9)) the true chance, while it sometimes exceeds the nominal value, is always considerably nearer it than using the significance levels of Method 1.
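The comparison of the two methods can be sketched numerically. The following is a minimal illustration, not the computation actually used for Table 5: it assumes that a point (a, b) lies on or beyond the lower level when a + c − ā ≤ −u_ε σ_a, with c = ½ for Method 1 and c = 0 for Method 2, that points with r = 0 or s = 0 are excluded, and that a and b are independent binomial variables with common chance p. The function names are hypothetical.

```python
from math import comb, sqrt


def binom_pmf(k, size, p):
    """Binomial probability of k successes in `size` trials."""
    return comb(size, k) * p**k * (1 - p) ** (size - k)


def true_chance(m, n, p, u_eps, correction):
    """Summed chance, over the whole lattice, of (a, b) falling on or
    beyond the lower significance level (assumed cut-off rule, see above)."""
    N = m + n
    total = 0.0
    for a in range(m + 1):
        for b in range(n + 1):
            r = a + b
            s = N - r
            if r == 0 or s == 0:
                continue  # u is 0/0 at these points; they are excluded
            a_bar = r * m / N
            sigma = sqrt(m * n * r * s / (N**2 * (N - 1)))
            if a + correction - a_bar <= -u_eps * sigma:
                total += binom_pmf(a, m, p) * binom_pmf(b, n, p)
    return total


for p in (0.2, 0.5):
    method1 = true_chance(18, 12, p, 1.6449, 0.5)  # with the 1/2 correction
    method2 = true_chance(18, 12, p, 1.6449, 0.0)  # correction omitted
    print(f"p = {p}: Method 1 {method1:.4f}, Method 2 {method2:.4f}")
```

Under these assumptions the printed values can be set beside cols. (2) and (4) of Table 5; the Method 2 region always contains the Method 1 region, so its chance can never be the smaller of the two.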


42. It is clear that no final conclusions can be based on two numerical examples, but it 
seems that the test of the null hypothesis in Problem II should be carried out as follows: 

(a) When m, n, r or s are small, with the help of tables prepared on Barnard’s lines, based 
on an ordered classification of the points in the lattice diagram, and giving the true upper 
bound of the chance that a point (a, b) falls on or beyond the level on which the observed 
result lies. The particular basis of his classification may, of course, be modified. 


(b) When m, n, r and s are large, by assuming that the u of equation (22) is a normal
deviate with unit standard deviation.


(vii) THE CLASSICAL APPROACH TO PROBLEM II
43. It has recently become customary to regard the test of significance applied to data given in a 2 × 2 table as the limiting case of a χ² test with one degree of freedom. But Problem II was originally answered in somewhat different terms. It was noted that if

$$p_1(A) = p_2(A) = p, \tag{23}$$

then the fractions a/m and b/n would both have expectations of p and variances of p(1 − p)/m and p(1 − p)/n, respectively. Hence, if the null hypothesis were true, the difference

$$d = \frac{a}{m} - \frac{b}{n} \tag{24}$$

would have mean $\bar{d} = 0$ and standard deviation

$$\sigma_d = \sqrt{\left[ p(1-p)\left( \frac{1}{m} + \frac{1}{n} \right) \right]}. \tag{25}$$

In large samples, therefore, it might be expected that

$$\frac{d}{\sigma_d} = \frac{a/m - b/n}{\sqrt{[\,p(1-p)(1/m + 1/n)\,]}} \tag{26}$$

would be approximately normally distributed. Since by the nature of the problem the common value of p was unknown, an estimate was made from the sample, namely,

$$\hat{p} = \frac{a+b}{m+n} = \frac{r}{N}. \tag{27}$$

Substituting this into equation (26), we have

$$\frac{d}{s_d} = \frac{a/m - b/n}{\sqrt{[\,(r/N)(1 - r/N)(1/m + 1/n)\,]}} \tag{28.1}$$

$$\phantom{\frac{d}{s_d}} = \frac{a - rm/N}{\sqrt{(mnrs/N^{3})}}. \tag{28.2}$$

44. The form (28.2) is easily derived from (28.1), if we remember that b = r − a, s = N − r and m + n = N.* It is seen that the ratio d/s_d is identical with the ratio u of equation (22), except for a factor √[(N − 1)/N] which is unimportant in large samples. Thus the classical test is practically identical with that suggested in paras. 40-42 above, though the two tests are differently derived.
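The relation between the two ratios can be checked numerically. The sketch below is an illustration only: it evaluates u from equation (22) and d/s_d from equation (28.1) for the frequencies a = 15, b = 5, m = 18, n = 12 used later in Table 7, and compares their quotient with √[(N − 1)/N]. The function names are hypothetical.

```python
from math import sqrt


def u_ratio(a, b, m, n):
    """Ratio u of equation (22): (a - rm/N) over the hypergeometric s.d."""
    N, r = m + n, a + b
    s = N - r
    return (a - r * m / N) / sqrt(m * n * r * s / (N**2 * (N - 1)))


def classical_ratio(a, b, m, n):
    """Ratio d/s_d of equation (28.1), with p estimated by r/N."""
    N, r = m + n, a + b
    p_hat = r / N
    d = a / m - b / n
    s_d = sqrt(p_hat * (1 - p_hat) * (1 / m + 1 / n))
    return d / s_d


a, b, m, n = 15, 5, 18, 12
N = m + n
u = u_ratio(a, b, m, n)
t = classical_ratio(a, b, m, n)
print(f"u = {u:.4f}, d/s_d = {t:.4f}")
print(f"u / (d/s_d) = {u / t:.6f}, sqrt((N-1)/N) = {sqrt((N - 1) / N):.6f}")
```

With N = 30 the factor is about 0.983, so the two ratios differ by under 2 % even for samples of this modest size.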


* A third alternative form is, of course, (ad − bc)√N/√(mnrs).







(viii) PROBLEM III


45. This may be described as the test for the independence of two characters A and B. It is supposed that the probability that an individual selected at random will possess character A is p(A) and that he will not possess it is p(Ā) = 1 − p(A). The corresponding probabilities for character B are p(B) and p(B̄) = 1 − p(B). Four alternative combinations of the characters may occur, which may be denoted by AB, ĀB, AB̄ and ĀB̄. The various probabilities are set out in Table 6A. If the null hypothesis, H_0, specifying the independence of A and B is true, then

p(AB) = p(A) × p(B), p(ĀB) = p(Ā) × p(B), etc. (29)

To test the hypothesis, we have a random sample of N observations with frequencies of occurrence of the combinations AB, ĀB, etc., which may be classified in the 2 × 2 scheme of Table 6B. The sampling conditions are such that the probabilities of Table 6A are the same for all individuals selected, or, in conventional terms, the sample is drawn from an infinite population. Barnard calls this problem that of the double dichotomy.




















Table 6A. Probabilities                         Table 6B. Sample data

            A        Ā        Total                         A     Ā     Total
B           p(AB)    p(ĀB)    p(B)              B           a     c     m
B̄           p(AB̄)    p(ĀB̄)    p(B̄)              B̄           b     d     n
Total       p(A)     p(Ā)     1                 Total       r     s     N



































46. In Problem III there is only one application of a random process, the selection of N 
individuals, each one of which must fall into one or other of four alternative categories. If 
the random process were repeated and another sample of N drawn, not only are the fre- 
quencies a, b, c and d free to vary, but also both marginal totals, i.e. m may change as well 
as r. The experimental probability set will therefore contain results (a,b,c, d) restricted by 
the conditions (i) that none of the frequencies can be negative and (ii) that 


a+b+c+d=N. (30) 


Geometrically, as Barnard points out, the set can be represented in 3 dimensions by points
at unit intervals within a tetrahedron obtained by placing on top of one another the series
of 2-dimensioned lattices of dimensions

$$0 \times N,\ 1 \times (N-1),\ 2 \times (N-2),\ \ldots,\ (N-1) \times 1,\ N \times 0. \tag{31}$$
47. We are again testing a composite hypothesis and should like to determine a family of critical surfaces to be used as significance levels, dividing the points within the tetrahedron in such a way that the chance of the sample point (a, b, c, d)* lying outside a given surface L_ε is equal to ε, whatever the values of the unknown probabilities p(A) and p(B). But again, as in Problem II, owing to the discontinuity in the set of points, there are no ‘similar


* In view of the condition (30), the point can be defined by three co-ordinates, e.g. as (a, b, c),
(a, b, m) or (a, r, m). In view of the form of equation (32), the last system of co-ordinates will be used.










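regions’. We note that if H_0 is true, the probability of the observed result is a term of the multinomial expansion, viz.

$$\frac{N!}{a!\,b!\,c!\,d!}\, p(AB)^{a}\, p(A\bar{B})^{b}\, p(\bar{A}B)^{c}\, p(\bar{A}\bar{B})^{d}$$

$$= \frac{N!}{a!\,b!\,c!\,d!}\, p(A)^{a+b}\, p(B)^{a+c}\, p(\bar{A})^{c+d}\, p(\bar{B})^{b+d}$$

$$= \frac{N!}{m!\,n!}\, p(B)^{m} (1 - p(B))^{n} \times \frac{N!}{r!\,s!}\, p(A)^{r} (1 - p(A))^{s} \times \frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!}$$

$$= P_1\{m \mid p(B), N\} \times P_1\{r \mid p(A), N\} \times P\{a \mid N, r, m\}. \tag{32}$$

Here, the notation of para. 30 has been repeated.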

48. Thus the probability of obtaining a sample represented by the triplet (a, r, m) may be regarded, if the characters A and B are independent, as the product of three terms:

(i) The probability of drawing m individuals with character B in a random sample of N, i.e. the probability that (a, r, m) falls in a horizontal section of the tetrahedron on which m = constant. This is the (m + 1)th term in the expansion of the binomial

{(1 − p(B)) + p(B)}^N.

(ii) The probability of drawing r individuals with character A in a random sample of N, i.e. the probability that (a, r, m) falls on the vertical section of the tetrahedron on which r = constant. This is the (r + 1)th term in the expansion of

{(1 − p(A)) + p(A)}^N.

(iii) The probability, given m and r, of the observed partition within the 2 × 2 table. This term represents the relative probability associated with the points lying along a straight line m = constant, r = constant; it is, of course, the same expression as has arisen in Problems I and II and is proportional to a term in the hypergeometric series F(−r, −m, n − r + 1, 1).
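The factorization into the three terms (i), (ii) and (iii) can be verified numerically. The sketch below is an illustration under stated assumptions: the frequencies are those of Table 7, the values p(A) = 0.6 and p(B) = 0.55 are arbitrary choices made only for the check, and the function names are hypothetical.

```python
from math import comb, factorial, isclose


def multinomial_prob(a, b, c, d, pA, pB):
    """Probability of the 2 x 2 table when A and B are independent.
    Cells: a = AB, b = A not-B, c = not-A B, d = not-A not-B."""
    N = a + b + c + d
    coef = factorial(N) // (
        factorial(a) * factorial(b) * factorial(c) * factorial(d)
    )
    return (
        coef
        * (pA * pB) ** a
        * (pA * (1 - pB)) ** b
        * ((1 - pA) * pB) ** c
        * ((1 - pA) * (1 - pB)) ** d
    )


def binom_pmf(k, N, p):
    """Binomial term: k individuals with the character in a sample of N."""
    return comb(N, k) * p**k * (1 - p) ** (N - k)


def hypergeom_term(a, N, r, m):
    """P{a | N, r, m} = m! n! r! s! / (N! a! b! c! d!)."""
    return comb(m, a) * comb(N - m, r - a) / comb(N, r)


a, b, c, d = 15, 5, 3, 7          # frequencies of Table 7
pA, pB = 0.6, 0.55                # arbitrary illustrative probabilities
N, r, m = a + b + c + d, a + b, a + c

lhs = multinomial_prob(a, b, c, d, pA, pB)
rhs = binom_pmf(m, N, pB) * binom_pmf(r, N, pA) * hypergeom_term(a, N, r, m)
print(lhs, rhs, isclose(lhs, rhs, rel_tol=1e-9))
```

Only the first two factors involve p(A) and p(B); the third, hypergeometric, factor is free of them, which is why it carries the whole of the test within a line m = constant, r = constant.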

49. We are faced with a situation similar to that met under Problem II. Were it possible to cut off from each line on which m = constant, r = constant, a group of points such that

$$\sum \left[ P\{a \mid N, r, m\} \right] = \epsilon, \tag{33}$$

then the subset of points within the tetrahedron composed of the sum of these groups for all possible combinations of m and r would have the property required of a ‘critical region’ in a significance test: i.e. the chance that the point (a, r, m) is included in the region, if H_0 is true, would be ε whatever values the irrelevant probabilities p(A) and p(B) assumed. However, (33) cannot be satisfied in general, and all that is possible is to define a family of significance contours such that the chance of a sample point falling beyond any one of them, say L_ε, is ≤ ε. By using the normal approximation to the sum of the hypergeometric tail-terms with the correction for continuity as described in paras. 35-39 for Problem II, we shall be very much on the safe side, i.e. the formal level of ε is likely to be much above the true chance of falling beyond the level, whatever be p(A) or p(B). The presence of the two binomial terms in equation (32) instead of the single term in equation (8.3), makes it likely that the overestimation of ε will be greater in Problem III than in II. It is to be expected, therefore, that, at any rate when none of m, n, r or s is too small, the better approximation will be obtained by referring the u of equation (22) to the normal probability scale.

50. The handling of Problem III is discussed briefly by Barnard on p. 136 above. There is clearly room for further investigation. The general nature of the approximation involved is of course that which arises in every χ² test for goodness of fit or for independence in an h × k table, where we replace a distribution consisting of a finite set of probabilities at discrete points in multiple space by a continuous distribution for which integration outside ellipsoidal contours is straightforward.


(ix) GENERAL COMMENT 


51. The duties of the statistician lie at many levels. He may be required merely to apply an established technique of analysis to an assembly of numerical data and this application may result in a statement, based on probability theory, of a ‘level of significance’ or a ‘confidence interval’, which will be used by others. Or he may be called on to share in planning the investigation or experiment which is to provide the data and then to draw conclusions from their analysis which will lead to further action. In this final role he needs to bring into play faculties which are no monopoly of his calling, the qualities of sound judgement which are the characteristics of a well trained, scientific mind. In the weighing of evidence, the result of the statistical analysis, expressed in one or more conventional probability figures, is only one factor in the summing up; as important, may be, is the question of whether the mathematical model is a fair counterpart to the happenings in the observational field. In addition, there will often be much information coming from outside the range of the immediate investigation, yet hardly expressible in numerical terms, which must influence decision.


52. It is perhaps hard experience gained in certain fields of war-time research, where 
decisions had to be reached on statistical data far less ample than could be wished, which has 
forced my own attention to this question: What weight do we actually give to the precise 
value of a probability measure when reaching decisions of first importance? One subject for 
examination falling under this inquiry is clearly the logical basis of the reasoning process 
by which judgement is influenced as a result of the application of a test of significance. This 
was the theme on which this paper opened. The approach illustrated in the pages which 
followed is a personal one and is set down, with no claim to be the best, in order to provoke 
thought and discussion. There appears no short route to a right answer in this matter; each
individual who hopes to use his own judgement to the full in drawing conclusions from the
statistical analysis of sampling data, must decide for himself what he requires of probability
theory.


53. In the approach which I have followed and illustrated on the analysis of data classed in a 2 × 2 table, the appropriate probability set-up is defined by the nature of the random process actually used in the collection of the data. Consideration of this point forms the initial step in the determination of the appropriate test. On this score, what I have termed Problems I, II and III are differentiated. The difference is fundamental and lies at the bottom of the dilemma to which the Barnard-Fisher correspondence in Nature drew attention. It can be illustrated on the following data, given in Table 7, where I shall suppose that the effect we are interested in is that making a significantly greater than b.

54. If (a) the results have been obtained by random assignment of Treatment 1 to 
eighteen out of thirty individuals and Treatment 2 to the remaining twelve, and 

(b) we merely ask whether the results are consistent with the hypothesis that the treatments are equivalent as far as these thirty individuals are concerned, so that the difference
between the proportions 15/18 and 5/12 may reasonably be ascribed to a chance fluctuation,






















(c) we are then concerned with Problem I, i.e. simply with the probabilities associated
with the points (a, 20 − a) on the diagonal r = 20 of Fig. 4. The chance of getting a ≥ 15, if
the null hypothesis is true, is 0.0241,* or, using a common phrase, we can speak of the result
being significant at the 2.5 % level.


55. On the other hand, if a sample of 18 has been drawn randomly from one population
and a sample of 12 independently from a second and we wish to test whether p_1(A) = p_2(A),
then it seems to be an artificial procedure to restrict the experimental probability set to
the 11 points on the line r = 20, i.e. to the values of a: 8, 9, ..., 18. A repetition of the double
sampling process could give us a result (a, b) falling at any of the 19 × 13 = 247 points in
the lattice diagram of Fig. 4. There will be a number of ways of defining a family of significance levels for this 2-dimensioned set; if we adopt that discussed in paras. 40-41, which





























Table 7

                                                   Frequency of results
For problem I      For problem II                  A          Ā          Total

1st treatment      Sample from 1st population      a = 15     c = 3      m = 18
2nd treatment      Sample from 2nd population      b = 5      d = 7      n = 12

Total                                              r = 20     s = 10     N = 30








gives as two of its members the dotted, stepped lines shown in Fig. 4, we can say that the
chance of a result falling beyond the lower line is certainly less than 0.015.† The observed
point, with a = 15, b = 5 falls beyond the line, so that the result is undoubtedly ‘significant
at the 1.5 % level’.


56. These two probabilities, 2.5 and 1.5 %, are not the same, but there is no inconsistency
in their difference. The character of the two investigations is different and to treat Problem II 
as though it were Problem I seems to call for a probability set-up which is unnecessarily 
artificial, when a simpler one is available. Admittedly by getting what seems to me a closer 
relation between the probability set-up and the experimental procedure, we have sacrificed 
some simplicity in handling the 2 x 2 table. But this is only the case when dealing with 
small numbers. For large numbers the methods of handling Problems I, II and III become, 
practically, identical. 


57. Consider again the heavy shell problem described in para. 7 above. If we are to introduce probability theory, it seems to me that we should regard the problem as one in which
we have a sample of m = 12 from the possible output of shell made to one design or by one
firm and of n = 8 from the possible output of a second. This sampling may be hypothetical
in that these may be ‘pilot’ shell, the first off production; nevertheless, this construct is

* For the normal curve approximation, using the correction for continuity, we find

u = (15 − ½ − 12.0)/1.2865 = 1.943.

The proportionate area under the normal curve beyond this deviation is 0.026.

† Table 5, col. (5) shows the largest value of this chance to be 0.0120 for p = 0.7. This figure cannot be much exceeded for other p’s though I have not determined the precise maximum. I give 0.015 as a safe-side limit.




clearly less artificial than one in which, on the null hypothesis, we regard the experiment as 
though it were made on twenty shells, to twelve of which has been randomly assigned the 
label ‘Made by firm X’ and to the other eight, ‘Made by firm Y’. 


58. It is clear that in the heavy shell problem there may be many reasons to doubt whether the rounds fired can be regarded as a random sample from future output. That is why I have emphasized that the exploration which the statistician makes in private will not necessarily be presented in figures at the conference table. In this example, the proportions of successful perforations were 2/12 and 5/8; these put us on the line, r = 7, of the lattice diagram for which the hypergeometric probabilities were shown in Fig. 2. The sum of the terms with a ≤ 2 is 5.2 % (normal approximation, using the ½-correction, 5.6 %). This is the chance of getting as great or a greater positive difference, b − a, if H_0 were true, treating the case as Problem I. Barnard’s method has not yet been extended to cover this case, but if we were to use the large sample method for handling Problem II, described in my paras. 40-41, we should find from equation (22) that

u = (2 − 4.2)/1.072 = −2.05,

which puts (a, b) outside the upper 5 % level.
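The three figures quoted for the heavy shell example can be reproduced directly. The following is a minimal sketch, with hypothetical function names, that computes the exact hypergeometric sum for a ≤ 2, its normal approximation with the ½ correction, and the uncorrected ratio u of equation (22).

```python
from math import comb, erfc, sqrt


def lower_tail_exact(a_obs, N, r, m):
    """Sum of hypergeometric terms P{a | N, r, m} for a <= a_obs."""
    n = N - m
    lowest = max(0, r - n)  # smallest value a can take
    terms = sum(comb(m, a) * comb(n, r - a) for a in range(lowest, a_obs + 1))
    return terms / comb(N, r)


def normal_lower_tail(u):
    """Area of the unit normal curve below the deviate u."""
    return 0.5 * erfc(-u / sqrt(2))


m, n, a, b = 12, 8, 2, 5
N, r = m + n, a + b
s = N - r
a_bar = r * m / N
sigma = sqrt(m * n * r * s / (N**2 * (N - 1)))

exact = lower_tail_exact(a, N, r, m)     # Problem I, exact sum
u_corrected = (a + 0.5 - a_bar) / sigma  # with the 1/2 correction
u_plain = (a - a_bar) / sigma            # equation (22), no correction

print(f"exact sum for a <= 2:         {exact:.4f}")
print(f"normal approx., 1/2 corrected: {normal_lower_tail(u_corrected):.4f}")
print(f"u of equation (22):            {u_plain:.3f}")
```

The exact sum and its corrected approximation both lie just above 0.05, while the uncorrected u lies beyond −1.645, which is the contrast between the two treatments that the paragraph draws.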


59. Were the action taken to be decided automatically by the side of the 5 % level on 
which the observation point fell, it is clear that the method of analysis used would here be 
of vital importance. But no responsible statistician, faced with an investigation of this 
character, would follow an automatic probability rule. The result of either approach would 
raise considerable doubts as to whether the performance of the first type of shell was as good 
as that of the second, but without the whole background of the investigation it is impossible 
to say what the statistician’s recommendation as to further action would be. 


60. In the example of the proof of anti-tank shot discussed in para. 6, the chance of perforation, p, while varying from plate to plate and batch to batch, will almost certainly not range through the whole interval 0-1. The striking-velocity of the shot would also probably be adjusted so that for average proof-plate and batches, p was near ½. Then the discriminating level (or levels*) set across the 13 × 13 lattice diagram would be fixed paying regard to the likely variation in p; thus a fairly close upper limit could be calculated to the true probability of (a, b) falling beyond the level if the fresh batch were of the same quality as the standard. This is the upper limit of the risk of segregating the batch wrongly.


61. Precisely similar problems arise for consideration in even more difficult form in the analysis of data arranged in an h × k table, where h or k or both are > 2. It has become common practice to speak of the solution of this problem in terms of ‘fixed marginal totals’, but it may be questioned whether the restriction in the experimental probability set implied is generally appropriate. The frequencies in an h × k table may have been obtained by many different sampling procedures for, as in the 2 × 2 problem, a single form of tabular presentation will follow from a variety of types of investigation. For most of these, a repetition of the random process of selection would give results with either one or both sets of marginal totals changed.


62. For convenience in solution we may, of course, start by considering the distribution 
of our test criterion, on the null hypothesis, within the sub-set of results for which the margins 


* It is possible that two levels might be taken with the associated proof rules: (i) if (a, b) falls beyond
the outer one, reject the batch; (ii) if between outer and inner, fire further rounds; (iii) if within the
inner level, accept the batch.


































are fixed. If this distribution were the same whatever these fixed values, then the overall
distribution for unrestricted sampling would be the same as that for variation subject to
fixed margins. Thus, mathematically, the solution of the partial problem would be a step in
the solution of the complete one. But when applying χ² analysis to an h × k table, this result
is only true as a large-sample approximation.


63. If we use the mathematical model which it is suggested gives the most direct aid in reasoning from the observations, i.e. that which regards the experimental probability set as generated by a repetition of the random process of selection used in collecting the data, then in the majority of cases we cannot regard the marginal totals as fixed. Thus a rigorous treatment would lead, as in the case of the 2 × 2 table, to a differentiation into a number of solutions. It is to be hoped, however,* unless the numbers in the margins are very small, that the χ² approximation with its appropriate degrees of freedom† will give results which are not misleading. This approximation leads, of course, in the 2 × 2 table to the reference of the ratio u of equation (22) to the normal probability scale. Some aspects of the approximation in this more general case were discussed by Yates (1934, pp. 233-35).


64. In closing I should like again to acknowledge my indebtedness to Mr G. A. Barnard. Having had the good fortune to discuss these problems with him and see drafts of his work over a period of 2 or 3 years it is difficult to say how many of his ideas have been built unconsciously into my own earlier approach. But I am especially aware of the clarification which his emphasis on the distinction between Problems I, II and III brought to my survey. I am also very grateful to Mr M. G. Kendall, Dr R. C. Geary and Dr B. L. Welch for a number of helpful criticisms, and to Mrs Maxine Merrington for her extensive computing work, which has alone made possible the various numerical illustrations that I have given.


* From the point of view both of the exponents of the fixed marginal and unrestricted marginal approach.

† The statement that, for example, in applying the test of independence of two characters to an h × k table, the degrees of freedom are (h − 1) × (k − 1), does not of course mean that sampling is restricted by fixed marginal totals. All that is implied is that approximately the overall distribution of the χ² function of the observations used, is the same as that for sampling within the restricted sub-set; this is because the distribution within each sub-set is approximately independent of the particular marginal totals which define it.


REFERENCES 


Barnard, G. A. (1945a). Nature, Lond., 156, 177.
Barnard, G. A. (1945b). Nature, Lond., 156, 783.
Fisher, R. A. (1935). J. Roy. Statist. Soc. 98, 39.
Fisher, R. A. (1941). Science, 94, 210.
Fisher, R. A. (1945a). Nature, Lond., 156, 388.
Fisher, R. A. (1945b). Sankhyā, 7, 130.
Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co. Ltd.
Neyman, J. & Pearson, E. S. (1928a). Biometrika, 20A, 195.
Neyman, J. & Pearson, E. S. (1928b). Biometrika, 20A, 263.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 113.
Neyman, J. & Pearson, E. S. (1938). Statist. Res. Mem. 2, 25.
Pearson, K. (1899). Phil. Mag. 47, 236.
Przyborowski, J. & Wilenski, H. (1939). Biometrika, 13, 313.
Wilson, E. B. (1941). Science, 93, 557.
Wilson, E. B. (1942). Proc. Nat. Acad. Sci., Wash., 28, 94.
Yates, F. (1934). J. Roy. Statist. Soc. Suppl. 1, 217.










APPENDIX 


THE NORMAL CURVE APPROXIMATION IN PROBLEM I 


1. The following Tables 8 and 9 (A), (B) and (C) show the order of accuracy which results from using the normal curve integral as an approximation to the tail sums in the series

$$P\{a \mid N, r, m\} = \frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!}, \tag{34}$$

the terms of which are proportional to those in the hypergeometric series

F(−r, −m, N − m − r + 1, 1).

Here a is a variable which can assume the range of positive, integral values indicated under (i), (ii) and (iii) in para. 20 above, while N, r and m are fixed. The relation between these quantities and b, c, d, n and s is given in Table 1, para. 17. The method of approximation, using the ‘½’ correction for continuity, has been discussed in para. 25.


2. Table 8 takes the case of an equal partition, m = n = ½N, and shows the sum of the terms in the expression (34) for which a ≥ a_1, which is also the sum of terms for which a ≤ r − a_1. For m ≠ n, results are given in Table 9 for m > n and for the following proportionate partitions of N:

(A) m = ⅔N, n = ⅓N; (B) m = ¾N, n = ¼N; (C) m = (9/10)N, n = (1/10)N.

Here sums of terms at both tails of the series are needed. The sums (or chances of a ≥ a_1 or ≤ a_2) have not been given for all possible values of a, but, broadly speaking, for those within the limits where significance is likely to be in question. Sums below 0.0010 have generally been omitted. In each case the true sum of the terms (34) is compared with the approximation from the normal integral.
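The comparison made in the tables can be sketched for a single case. The code below is an illustration with hypothetical function names: it computes the true tail sum of the terms (34) and the normal approximation with the ½ correction for the equal partition m = n = 10 with r = 10, the entries of which appear in the last pair of columns of Table 8.

```python
from math import comb, erfc, sqrt


def upper_tail_exact(a1, N, r, m):
    """True sum of the terms P{a | N, r, m} for a >= a1."""
    n = N - m
    highest = min(r, m)  # largest value a can take
    terms = sum(comb(m, a) * comb(n, r - a) for a in range(a1, highest + 1))
    return terms / comb(N, r)


def upper_tail_normal(a1, N, r, m):
    """Normal approximation to the same sum, using the 1/2 correction."""
    n, s = N - m, N - r
    a_bar = r * m / N
    sigma = sqrt(m * n * r * s / (N**2 * (N - 1)))
    u = (a1 - 0.5 - a_bar) / sigma
    return 0.5 * erfc(u / sqrt(2))


for a1 in (7, 8, 9):
    true_sum = upper_tail_exact(a1, 20, 10, 10)
    approx = upper_tail_normal(a1, 20, 10, 10)
    print(f"a_1 = {a1}: true {true_sum:.4f}, normal approx. {approx:.4f}")
```

For a_1 = 7 the two values come to about 0.0894 and 0.0955, in agreement with the tabled entries, the approximation being a little too large as para. 4 remarks for the symmetrical case.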


3. In drawing conclusions from the comparison, we have to decide what degree of accuracy is called for. Clearly the normal integral does not give mathematically exact results to 4 decimal places. On the other hand, except for certain instances where the partition is very unequal (m = ¾N and (9/10)N) and r is small, the order of the approximation may be said to follow that of the series closely. If decisions are made by rule of thumb, according to the side of the 5 % or 1 % significance level on which a falls, then there are a number of entries in the tables where the approximation would give a on the wrong side. But one may question whether judgement of significance based on a single experiment can in fact be made sensitive to a difference between, say, 0.06 and 0.04 (odds of 16 to 1 and 24 to 1) or between 0.012 and 0.008 (odds of 82 to 1 and 124 to 1) and, given such latitude in accuracy, the approximation will be found generally sufficient. These must be points, however, where personal opinions will differ. Whatever views are held, the tables are sufficiently extensive to make it possible to obtain from them a rough measure of the accuracy of approximation in a wide range of cases.


4. It will be noted that in the symmetrical case ($m = \tfrac12 N$) and also when $m = \tfrac35 N$ the
normal approximation for the tail sum is almost invariably a little too large. Undoubtedly
for the symmetrical case an improved approximation could be obtained by modifying
the ½ correction used in calculating the ratio of deviation to standard deviation. This
second order term would, however, need to vary with the probability level, thus
complicating the procedure.


































Table 8. Case of equal partition, $m = n = \tfrac12 N$. Chance that $a \ge a_0$ = chance that $a \le r - a_0$

        |   m = n = 50    |   m = n = 30    |   m = n = 20    |   m = n = 15    |   m = n = 10
 r  a_0 | True    Normal  | True    Normal  | True    Normal  | True    Normal  | True    Normal
        |         approx. |         approx. |         approx. |         approx. |         approx.
30  17  | 0.2566  0.2574  | 0.2194  0.2212  |                 |                 |
    18  | 0.1376  0.1388  | 0.0981  0.1002  |                 |                 |
    19  | 0.0630  0.0643  | 0.0348  0.0365  |                 |                 |
    20  | 0.0243  0.0253  | 0.0096  0.0106  |                 |                 |
    21  | 0.0078  0.0085  | 0.0020  0.0024  |                 |                 |
    22  | 0.0021  0.0024  |                 |                 |                 |
20  12  | 0.2269  0.2278  | 0.2060  0.2076  | 0.1715  0.1745  |                 |
    13  | 0.1053  0.1068  | 0.0852  0.0873  | 0.0564  0.0592  |                 |
    14  | 0.0392  0.0408  | 0.0270  0.0287  | 0.0128  0.0144  |                 |
    15  | 0.0114  0.0126  | 0.0064  0.0073  | 0.0019  0.0025  |                 |
    16  | 0.0025  0.0031  | 0.0011  0.0014  |                 |                 |
15   9  | 0.2884  0.2887  | 0.2760  0.2772  | 0.2572  0.2595  | 0.2330  0.2364  |
    10  | 0.1312  0.1325  | 0.1163  0.1185  | 0.0954  0.0985  | 0.0715  0.0755  |
    11  | 0.0453  0.0473  | 0.0358  0.0380  | 0.0242  0.0265  | 0.0134  0.0156  |
    12  | 0.0113  0.0129  | 0.0077  0.0090  | 0.0040  0.0049  | 0.0014  0.0020  |
    13  | 0.0019  0.0027  | 0.0011  0.0016  |                 |                 |
10   7  | 0.1589  0.1599  | 0.1495  0.1514  | 0.1367  0.1397  | 0.1226  0.1266  | 0.0894  0.0955
     8  | 0.0458  0.0486  | 0.0399  0.0429  | 0.0324  0.0357  | 0.0251  0.0285  | 0.0115  0.0147
     9  | 0.0078  0.0101  | 0.0061  0.0081  | 0.0042  0.0058  | 0.0026  0.0038  | 0.0005  0.0011
    10  | 0.0006  0.0014  | 0.0004  0.0010  |                 |                 |
 7   5  | 0.2179  0.2177  | 0.2119  0.2126  | 0.2038  0.2056  | 0.1950  0.1980  | 0.1749  0.1804
     6  | 0.0558  0.0594  | 0.0514  0.0553  | 0.0458  0.0501  | 0.0401  0.0448  | 0.0286  0.0338
     7  | 0.0062  0.0096  | 0.0053  0.0084  | 0.0042  0.0068  | 0.0032  0.0055  | 0.0015  0.0031
 5   4  | 0.1810  0.1806  | 0.1766  0.1771  | 0.1709  0.1735  | 0.1648  0.1677  | 0.1517  0.1571
     5  | 0.0281  0.0339  | 0.0261  0.0320  | 0.0236  0.0295  | 0.0211  0.0270  | 0.0163  0.0220






























































Table 9. Case of unequal partition. Chances that $a \le a_0$ and $a \ge a_0$
(A) $m = \tfrac35 N$, $n = \tfrac25 N$



































PP m = 24, =z 12, 
Partition n=16 = 
| 
Chance} Normal] Normal Chance 7 
that approx. approx. 
asa 0-0050 wR 
-0329 
-1348 
30 30| 
0-1348 
-0329 
42a, aad a>aq, 
' 
0-0021 
a<a, -0128 aga, 
0555 
-1695 
20 20 
0-1695 
-0555 
a>a, -0128 a>a, 
-0021 
0-0015 
-0106 
ia <a, | 0499 aga, 
L -1618 0-0616 
15 15 
0-1618 0-0616 
-0499 0051 
a>a, -0106 azaq, 
-0015 
0-0050 0-0009 i 
a<a, * 0329 0131 a<a, 
pen +1348 -0910 
10 
0-1348 0-0910 \ 
ja>a, “0270 | -0329 ‘0131 a>a, 
. -0050 -0009 
0 0-0010 0 
ia<a,i| 1 ‘0118 0-0059 | 1 Jace, 
. 2 0756 | -0770 0564 | 2 
1 
6 0-1378 0-1127 | 6 
a>a, | 7 0186 | -0269 0-0160 | 7 } OM 
0-0080 0:0051 
0742 0616 itn es, 
5 
0-0742 0-0616 a2>a, 


















































































Table 9 (continued) 
































































































































(B) $m = \tfrac45 N$, $n = \tfrac15 N$

                     | m = 80, n = 20  | m = 48, n = 12  | m = 32, n = 8
 r  Chance      a_0  | True    Normal  | True    Normal  | True    Normal
    that             |         approx. |         approx. |         approx.
30  a ≤ a_0     18   | 0.0018  0.0014  |                 |
                19   | 0.0084  0.0073  | 0.0013  0.0020  |
                20   | 0.0306  0.0288  | 0.0106  0.0125  |
                21   | 0.0884  0.0874  | 0.0521  0.0548  |
                22   | 0.2046  0.2078  | 0.1667  0.1685  |
    a ≥ a_0     26   | 0.2092  0.2078  | 0.1667  0.1685  |
                27   | 0.0824  0.0874  | 0.0521  0.0548  |
                28   | 0.0227  0.0288  | 0.0106  0.0125  |
                29   | 0.0039  0.0073  | 0.0013  0.0020  |
                30   | 0.0003  0.0014  |                 |
20  a ≤ a_0     11   | 0.0040  0.0026  | 0.0013  0.0011  |
                12   | 0.0182  0.0148  | 0.0095  0.0087  | 0.0016  0.0031
                13   | 0.0638  0.0600  | 0.0460  0.0448  | 0.0218  0.0255
                14   | 0.1729  0.1755  | 0.1523  0.1542  | 0.1176  0.1208
    a ≥ a_0     18   | 0.1758  0.1755  | 0.1522  0.1542  | 0.1176  0.1208
                19   | 0.0499  0.0600  | 0.0371  0.0448  | 0.0218  0.0255
                20   | 0.0066  0.0148  | 0.0041  0.0087  | 0.0016  0.0031
15  a ≤ a_0      7   | 0.0018  0.0009  | 0.0008  0.0004  |
                 8   | 0.0107  0.0074  | 0.0064  0.0049  | 0.0022  0.0024
                 9   | 0.0462  0.0408  | 0.0355  0.0323  | 0.0217  0.0219
                10   | 0.1470  0.1480  | 0.1329  0.1338  | 0.1115  0.1133
    a ≥ a_0     14   | 0.1453  0.1480  | 0.1294  0.1338  | 0.1079  0.1133
                15   | 0.0262  0.0408  | 0.0206  0.0323  | 0.0141  0.0219
10  a ≤ a_0      4   | 0.0039  0.0019  | 0.0026  0.0013  | 0.0012  0.0008
                 5   | 0.0254  0.0191  | 0.0206  0.0159  | 0.0145  0.0121
                 6   | 0.1095  0.1068  | 0.1012  0.0988  | 0.0893  0.0882
    a ≥ a_0     10   | 0.0951  0.1068  | 0.0868  0.0988  | 0.0761  0.0882
 7  a ≤ a_0      2   | 0.0033  0.0013  | 0.0024  0.0010  | 0.0015  0.0007
                 3   | 0.0282  0.0203  | 0.0246  0.0181  | 0.0201  0.0155
                 4   | 0.1408  0.1417  | 0.1354  0.1364  | 0.1281  0.1293
    a ≥ a_0      7   | 0.1985  0.1910  | 0.1906  0.1848  | 0.1805  0.1776
 5  a ≤ a_0      1   | 0.0053  0.0022  | 0.0045  0.0021  | 0.0035  0.0016
                 2   | 0.0531  0.0434  | 0.0499  0.0430  | 0.0467  0.0383
    a ≥ a_0      5   | 0.3193  0.2841  | 0.3135  0.2835  | 0.3060  0.2776

(C) $m = \tfrac{9}{10} N$, $n = \tfrac{1}{10} N$

                     | m = 90, n = 10
 r  Chance      a_0  | True    Normal
    that             |         approx.
30  a ≤ a_0     22   | 0.0009  0.0006
                23   | 0.0073  0.0057
                24   | 0.0388  0.0352
                25   | 0.1384  0.1388
    a ≥ a_0     29   | 0.1356  0.1388
                30   | 0.0229  0.0352
20  a ≤ a_0     14   | 0.0039  0.0019
                15   | 0.0254
                16   | 0.1095
    a ≥ a_0     20   | 0.0951  0.1068
15  a ≤ a_0      9   | 0.0006  0.0001
                10   | 0.0063  0.0027
                11   | 0.0408  0.0316
                12   | 0.1705  0.1765
    a ≥ a_0     15   | 0.1808  0.1765
10  a ≤ a_0      5   | 0.0006  0.0001
                 6   | 0.0082  0.0029
                 7   | 0.0600  0.0486
                 8   | 0.2615  0.2902
    a ≥ a_0     10   | 0.3305  0.2902
 7  a ≤ a_0      3   | 0.0016  0.0003
                 4   | 0.0207  0.0096
                 5   | 0.1442  0.1492
    a ≥ a_0      7   | 0.4667  0.3974
 5  a ≤ a_0      2   | 0.0067  0.0006
                 3   | 0.0769  0.0538
    a ≥ a_0      5   | 0.4163  0.5000

























2x2 TABLES. A NOTE ON E. S. PEARSON’S PAPER 
By G. A. BARNARD 


As Prof. Pearson has kindly shown me the proof of his paper, I should like to make the 
following further remarks. 


1. If we have a sample of $N$ from a population in which there is a chance $p$ that an
individual will have a character A, we can represent it in the form
$$(x_1, x_2, \ldots, x_i, \ldots, x_N),$$
where $x_i$ is 1 or 0 according to whether the $i$th member has A or not.* Regarding the
$x$'s as quantitative variables, we have by classical results the unbiased estimates
$$\hat p = x_. = (\Sigma x_i)/N \quad\text{and}\quad \hat\sigma^2 = \{\Sigma (x_i - x_.)^2\}/(N-1).$$
If $r$ of the $x$'s are 1, while $s$ are 0, we find
$$\hat p = r/N \quad\text{and}\quad \hat\sigma^2 = rs/\{N(N-1)\}.$$

Using this unbiased estimate of variance in Prof. Pearson's para. 43, we get, instead of his
(28.2),
$$d = \frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}}, \tag{1}$$
agreeing exactly with his (22).
2. To carry the argument further, in classical theory, if we have two samples
$$(x_1, x_2, \ldots, x_i, \ldots, x_m) \quad\text{and}\quad (y_1, y_2, \ldots, y_j, \ldots, y_n),$$
to test whether the samples come from the same normal population we take
$$t = \frac{x_. - y_.}{s}\sqrt{\frac{mn}{m+n}},$$
where $x_. = (\Sigma x_i)/m$, $y_. = (\Sigma y_j)/n$, and
$$s^2 = \frac{\Sigma (x_i - x_.)^2 + \Sigma (y_j - y_.)^2}{m+n-2}, \tag{2}$$
and use tables of the $t$ distribution for $(m+n-2)$ degrees of freedom.
It is common practice to neglect departures from normality in applying this test. If we
do so, and apply it to our qualitative case along the lines indicated above, we get
$$t = \frac{a - rm/N}{\sqrt{\dfrac{acn + bdm}{N(N-2)}}},$$
which, if we are justified in our neglect of departures from normality, should be distributed
as $t$ on $(N-2)$ degrees of freedom.


* For a similar argument see B. L. Welch (1938, p. 155).
















3. To obtain the formula (1) on these lines, we have in effect to commit the well-known
fallacy of replacing $s^2$ as given by (2), by
$$s'^2 = \frac{\Sigma (x_i - m')^2 + \Sigma (y_j - m')^2}{m+n-1}, \tag{3}$$
where $m' = (\Sigma x_i + \Sigma y_j)/(m+n)$.
We are led to ask why (3) should be approximately correct (and in fact it is better than (2))
in the qualitative case, while (2) is preferred in the quantitative case.
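The algebra of paras. 1 to 3 can be checked numerically. The sketch below is an illustration only: it codes a 2 × 2 table as two samples of 0/1 scores, computes the ratio with the variance estimate (2) and with the pooled-mean estimate (3), and compares each with the corresponding closed form. The cell labels a, c (first sample) and b, d (second sample) follow the symbols appearing in the formulae above; the function name and the example counts are hypothetical.

```python
import math


def two_by_two_ratios(a, c, b, d):
    """Return (t from (2), its closed form, ratio from (3), closed form (1))."""
    m, n = a + c, b + d          # sample sizes
    N, r, s = m + n, a + b, c + d
    x = [1] * a + [0] * c        # first sample scored 1 (has A) or 0
    y = [1] * b + [0] * d        # second sample
    x_bar, y_bar, pooled = a / m, b / n, r / N

    # Estimate (2): deviations from each sample's own mean, N - 2 degrees of freedom.
    s2 = (sum((v - x_bar) ** 2 for v in x) + sum((v - y_bar) ** 2 for v in y)) / (N - 2)
    # Estimate (3): deviations from the pooled mean m', divisor N - 1.
    s2_dash = (sum((v - pooled) ** 2 for v in x) + sum((v - pooled) ** 2 for v in y)) / (N - 1)

    scale = math.sqrt(m * n / N)
    t_from_2 = (x_bar - y_bar) / math.sqrt(s2) * scale
    ratio_from_3 = (x_bar - y_bar) / math.sqrt(s2_dash) * scale

    closed_t = (a - r * m / N) / math.sqrt((a * c * n + b * d * m) / (N * (N - 2)))
    closed_1 = (a - r * m / N) / math.sqrt(m * n * r * s / (N ** 2 * (N - 1)))
    return t_from_2, closed_t, ratio_from_3, closed_1


t2, t2_closed, d3, d1_closed = two_by_two_ratios(a=7, c=3, b=2, d=8)
print(t2, t2_closed, d3, d1_closed)
```

Each pair of printed values should agree, which is the sense in which (1) corresponds to using (3) in place of (2).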





4. The simplest reason for preferring (2) to (3) in the quantitative case is that $s'^2$ is not
independent of $(x_. - y_.)$, so that the conditions for validity of the $t$ distribution are not
satisfied. In our qualitative case this argument loses validity, since neither $s^2$ nor $s'^2$ is
independent of $(x_. - y_.)$.

The second reason for preferring (2) to (3) in the quantitative case is more complicated,
but for our purposes it reduces essentially to the fact that, in the case of normal distributions, 
and only in this case, the mean and variance of samples are independently distributed, so 
that the common mean value of the populations, estimated by m’, is irrelevant to the test 


for differences. In our qualitative case, on the other hand, m’ contributes to our knowledge 
of the variance. 


5. If we apply Pitman's 'absolute' analogue of the t test to our case, we arrive at the
hypergeometric series of Prof. Pearson's Problem I. But Bartlett's argument, showing the
convergence of Pitman's test and the t test, will apply here only in very large samples, because
of the finite probability of obtaining observed values which coincide.


6. From the above point of view, Prof. Pearson's analysis of his Problem II may be
regarded in one sense as an examination of the effect of large departures from normality on
the t test. In this light, his conclusions given in paras. 51 and 52 are seen to extend to the
t test, as well as to the 2 × 2 table problem.


7. If I may state my personal attitude, it is that statistics is a branch of applied 
mathematics, like symbolic logic or hydrodynamics. Examination of foundations is 
desirable, but it must be remembered that undue emphasis on niceties is a disease to which 
persons with mathematical training are specially prone. In pure mathematics itself there are 
disputes on foundations which closely parallel the disputes over the foundations of statistics. 
The lesson to be drawn is, that while statistics is a most valuable aid to judgement, it cannot 
wholly replace it. 


8. Finally, it must be emphasized that the order of printing of Prof. Pearson’s paper and 
my own reflects Prof. Pearson’s generosity rather than the historical order of events. Much 
of his paper was, unknown to me, given in lectures before the war; whereas my work on the 
problem began only in 1943. Since then I have owed much both to Prof. Pearson’s published 
work and to discussions which I have been privileged to have with him. 


REFERENCE 
WELCH, B. L. (1938). Biometrika, 30, 155. 














THE CUMULANTS OF THE Z AND OF THE LOGARITHMIC χ²
AND t DISTRIBUTIONS


By JOHN WISHART 
School of Agriculture, Cambridge 


Explicit expressions for the exact cumulants of Fisher's z-distribution do not appear ever
to have been published. They were therefore worked out, and appear in § 2 of this paper.
It afterwards appeared that the logical method of presentation was to deal with the similar
problem for $\tfrac12 \log(\chi^2/n)$,* since the z-distribution involves the simple difference of two such
functions which are independent. This led to § 1. Since writing this paper, Bartlett & Kendall
(1946) have published the same result in the form of the cumulants of $\log s^2$, and have given
graphical and tabular representations for varying n up to 20. The solution is, of course,
implicit in Cornish & Fisher's (1937) statement of the moment generating function, while
Mr C. R. Rao has informed me that he reached the same result in work done for an M.A.
Thesis of the University of Calcutta (unpublished). § 1 has accordingly been shortened,
but is retained in view of the additional formulae to those of Bartlett and Kendall.


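1. THE LOGARITHMIC χ² DISTRIBUTION
The distribution of $\chi^2$, for $n$ degrees of freedom, is given by
$$\frac{1}{\Gamma(\tfrac12 n)}\,(\tfrac12\chi^2)^{\frac12 n - 1}\, e^{-\frac12\chi^2}\, d(\tfrac12\chi^2).$$
As pointed out by Cornish & Fisher (1937), the mean value of $\exp\{\tfrac12 it \log(\chi^2/n)\}$,
i.e. of $\exp\{\tfrac12 it \log(\tfrac12\chi^2) - \tfrac12 it \log(\tfrac12 n)\}$,
is the moment generating function of the distribution of $\tfrac12\log(\chi^2/n)$, namely
$$M = \frac{\Gamma\{\tfrac12(n + it)\}}{\Gamma(\tfrac12 n)}\, \exp\{-\tfrac12 it \log(\tfrac12 n)\}.$$
The cumulant generating function is
$$K = \log M = -\tfrac12 it \log(\tfrac12 n) + \log\Gamma\{\tfrac12(n + it)\} - \log\Gamma(\tfrac12 n).$$
The cumulants of the distribution of $\tfrac12\log(\chi^2/n)$ are readily written down by differentiating
$K$ successively with respect to $it$ and at each stage putting $t = 0$. We have in fact
$$\kappa_1 = -\tfrac12\log(\tfrac12 n) + \frac{d}{dn}\log\Gamma(\tfrac12 n) = -\tfrac12\Big\{\log a + \mathrm{Lt}_{s\to 1}\Big(\zeta(s,a) - \frac{1}{s-1}\Big)\Big\} \tag{1}$$
and
$$\kappa_s = (-1)^s\, 2^{-s}\,(s-1)!\;\zeta(s,a) \quad (s > 1,\ a = \tfrac12 n),$$
where $\zeta(s,a)$ denotes the generalized Zeta-function
$$\zeta(s,a) = \sum_{j=0}^{\infty}\frac{1}{(a+j)^s}.$$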
* All logarithms in this paper are to base e. 













The cumulants may be readily computed by throwing them into the form
$$2\kappa_1 = \psi(\tfrac12 n) - \log(\tfrac12 n), \qquad 2^s\kappa_s = \psi^{(s-1)}(\tfrac12 n), \tag{2}$$
where $\psi(x) = d\{\log\Gamma(x)\}/dx$, $\psi^{(s-1)}(x) = d^s\{\log\Gamma(x)\}/dx^s$.
$\psi(x)$ is variously called the Psi or Digamma function, and its derivatives have been called
the Trigamma, Tetragamma, etc. Functions, and the series the Polygamma Functions.
These functions have been computed in some considerable detail. For n up to 22 the mean
and variance can be got from Elinor Pairman's 'Tables of the Digamma and Trigamma
functions' (1919). Tables up to Pentagamma appear in Vol. 1 of the British Association's
Mathematical Tables (1931), but with certain gaps which, although intended to be bridged
by reduction formulae, render the tables less generally useful (for n less than 22) than
H. T. Davis's Tables (1933, 1935). Table 10 of Vol. 1 gives all that is required for $\psi(x)$; in
Vol. 1, Tables 14-16, 18-20, 22-24 and 26-28 cover a wide range up to Hexagamma.

As shown by Bartlett & Kendall (1946), the approach to normality is very slow. For
$n = 24$ (the limit for $n_1$ of the z table of Fisher & Yates (1943), which provides percentage
points for the distribution under consideration in the line $n_2 = \infty$) the cumulants have been
worked out to $\kappa_6$, the last being specially computed from its formula given below. The gamma
ratios are $\gamma_1 = -0.295$, $\gamma_2 = 0.174$, $\gamma_3 = -0.154$ and $\gamma_4 = 0.175$, and $|\gamma|$ increases
thereafter at this level of $n$ instead of tending to zero. Approximate percentage points may,
however, be worked out by using the formulae at the foot of the z table, putting $n_2 = \infty$.
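A short numerical sketch of formulae (1) and (2) may help. It is not the procedure used in the paper, which relied on printed Polygamma tables: here the generalized Zeta-function and the Digamma function are evaluated by a direct sum plus a few Euler-Maclaurin tail terms, and the function names and the choice of 30 direct terms are assumptions made for illustration. The gamma ratios are formed as $\gamma_j = \kappa_{j+2}/\kappa_2^{(j+2)/2}$.

```python
from math import factorial, log


def hurwitz_zeta(s, a, terms=30):
    """zeta(s, a) = sum over j >= 0 of (a + j)^(-s), for integer s > 1."""
    head = sum((a + j) ** -s for j in range(terms))
    x = a + terms
    # Euler-Maclaurin estimate of the remaining tail starting at x.
    tail = (x ** (1 - s) / (s - 1) + 0.5 * x ** -s
            + s * x ** (-s - 1) / 12
            - s * (s + 1) * (s + 2) * x ** (-s - 3) / 720)
    return head + tail


def digamma(a, terms=30):
    """psi(a) via the recurrence psi(a) = psi(a + 1) - 1/a and an asymptotic series."""
    x = a + terms
    asym = log(x) - 1 / (2 * x) - 1 / (12 * x ** 2) + 1 / (120 * x ** 4) - 1 / (252 * x ** 6)
    return asym - sum(1.0 / (a + j) for j in range(terms))


def log_chi2_cumulants(n, order=6):
    """Cumulants kappa_1..kappa_order of (1/2) log(chi^2 / n), n degrees of freedom."""
    a = n / 2
    kappa = {1: 0.5 * (digamma(a) - log(a))}
    for s in range(2, order + 1):
        kappa[s] = (-1) ** s * 2.0 ** -s * factorial(s - 1) * hurwitz_zeta(s, a)
    return kappa


kappa = log_chi2_cumulants(24)
gamma_ratio = {j: kappa[j + 2] / kappa[2] ** ((j + 2) / 2) for j in range(1, 5)}
for j in sorted(gamma_ratio):
    print(f"gamma_{j} = {gamma_ratio[j]:.3f}")
```

The printed ratios can be compared with the values quoted in the text for n = 24.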

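For small $n$, we note that
$$\zeta(s) = \frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \ldots = \Big(\frac{1}{1^s} + \frac{1}{2^s} + \ldots + \frac{1}{(r-1)^s}\Big) + \zeta(s, r), \quad r \text{ an integer},$$
$$\zeta(s)(1 - 2^{-s}) = \frac{1}{1^s} + \frac{1}{3^s} + \frac{1}{5^s} + \ldots \text{ to } \infty = \Big(\frac{1}{1^s} + \frac{1}{3^s} + \ldots + \frac{1}{(2r-1)^s}\Big) + 2^{-s}\zeta(s, r + \tfrac12).$$
We thus get, for $n = 2r$,
$$\kappa_s = (-1)^s\, 2^{-s}\,(s-1)!\,\Big\{\zeta(s) - \Big(1 + \frac{1}{2^s} + \frac{1}{3^s} + \ldots + \frac{1}{(r-1)^s}\Big)\Big\} \quad (s > 1), \tag{3}$$
in which the terms in {...} reduce to $\zeta(s)$ for $n = 2$.
For $n = 2r + 1$
$$\kappa_s = (-1)^s\,(s-1)!\,\Big\{\zeta(s)(1 - 2^{-s}) - \Big(1 + \frac{1}{3^s} + \frac{1}{5^s} + \ldots + \frac{1}{(2r-1)^s}\Big)\Big\} \quad (s > 1), \tag{4}$$
in which the terms in {...} reduce to $\zeta(s)(1 - 2^{-s})$ for $n = 1$.
In the special case of $s = 1$ we have
$$n = 2r: \quad \kappa_1 = -\tfrac12(\gamma + \log \tfrac12 n) + \tfrac12\Big\{1 + \frac12 + \frac13 + \ldots + \frac{1}{r-1}\Big\},$$
$$n = 2r+1: \quad \kappa_1 = -\tfrac12(\gamma + \log 2n) + \Big\{1 + \frac13 + \frac15 + \ldots + \frac{1}{2r-1}\Big\}. \tag{5}$$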


For $n = 2$ and 1 respectively these expressions reduce to the first bracket. $\zeta(s)$ can be got
from tables, and in particular
$$\zeta(2m) = 2^{2m-1}\,\pi^{2m}\,B_m/(2m)!,$$
where the B's are the Bernoulli numbers $B_1 = \tfrac16$, $B_2 = \tfrac{1}{30}$, $B_3 = \tfrac{1}{42}$, $B_4 = \tfrac{1}{30}$, $B_5 = \tfrac{5}{66}$, etc.
$\gamma$ is Euler's constant. For reference we may quote:

γ    = 0.57721 56649,    ζ(4) = 1.08232 32337,
ζ(2) = 1.64493 40668,    ζ(5) = 1.03692 77551,
ζ(3) = 1.20205 69032,    ζ(6) = 1.01734 30620.
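The relation between the even Zeta values and the Bernoulli numbers is easily checked. The sketch below is only an illustration; it uses the older all-positive Bernoulli convention of the text ($B_1 = 1/6$, $B_2 = 1/30$, ...), and the dictionary and function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, pi

# Bernoulli numbers in the all-positive convention used in the text.
BERNOULLI = {
    1: Fraction(1, 6),
    2: Fraction(1, 30),
    3: Fraction(1, 42),
    4: Fraction(1, 30),
    5: Fraction(5, 66),
}


def zeta_even(m):
    """zeta(2m) = 2^(2m-1) * pi^(2m) * B_m / (2m)!"""
    return 2 ** (2 * m - 1) * pi ** (2 * m) * float(BERNOULLI[m]) / factorial(2 * m)


for m in (1, 2, 3):
    print(f"zeta({2 * m}) = {zeta_even(m):.10f}")
```

The printed values can be set against the reference values of ζ(2), ζ(4) and ζ(6) quoted above.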


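Note that
$$\mathrm{Lt}_{s\to 1}\Big(\zeta(s,a) - \frac{1}{s-1}\Big) = \gamma + \sum_{j=0}^{\infty}\Big(\frac{1}{j+a} - \frac{1}{j+1}\Big), \quad (R(a) > 0).$$
For large $n$, asymptotic formulae for the Zeta-function may be used, and we get
$$\kappa_1 \sim -\frac{1}{2n} - \sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j}{j\,n^{2j}} = -\frac{1}{2n} - \frac{1}{6n^2} + \frac{1}{15n^4} - \frac{8}{63n^6} + \frac{8}{15n^8} - \frac{128}{33n^{10}} + \ldots, \tag{6}$$
$$\kappa_s \sim (-1)^s\Big\{\frac{(s-2)!}{2n^{s-1}} + \frac{(s-1)!}{2n^s} + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\;n^{2j+s-1}}\Big\} \quad (s > 1).$$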
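As a rough check on the asymptotic series (6), the sketch below compares its first six terms with the exact value $\kappa_1 = \tfrac12\{\psi(\tfrac12 n) - \log(\tfrac12 n)\}$ at n = 24. The Digamma routine is an assumed stand-in for the printed tables (recurrence plus an asymptotic expansion), and the function names are hypothetical.

```python
from math import log


def digamma(a, terms=30):
    """psi(a) from the recurrence psi(a) = psi(a + 1) - 1/a plus an asymptotic series."""
    x = a + terms
    asym = log(x) - 1 / (2 * x) - 1 / (12 * x ** 2) + 1 / (120 * x ** 4) - 1 / (252 * x ** 6)
    return asym - sum(1.0 / (a + j) for j in range(terms))


def kappa1_exact(n):
    """Mean of (1/2) log(chi^2 / n) from formula (2)."""
    return 0.5 * (digamma(n / 2) - log(n / 2))


def kappa1_series(n):
    """First six terms of the asymptotic series (6)."""
    return (-1 / (2 * n) - 1 / (6 * n ** 2) + 1 / (15 * n ** 4)
            - 8 / (63 * n ** 6) + 8 / (15 * n ** 8) - 128 / (33 * n ** 10))


n = 24
difference = kappa1_exact(n) - kappa1_series(n)
print(kappa1_exact(n), kappa1_series(n), difference)
```

At this value of n the two figures should differ only far beyond the decimal places used in the paper; for small n the series is not expected to be adequate.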


We may note in passing that not only may this general expression for $\kappa_s$ be applied to the
special case $s = 1$ with the proviso that the first term in that case is dropped, but also that
$\kappa_2$ may be obtained from $\kappa_1 + \tfrac12\log(\tfrac12 n)$ by term-by-term differentiation with respect to $n$,
and likewise $\kappa_3$ from $\kappa_2$, $\kappa_4$ from $\kappa_3$, etc., by similar term-by-term differentiation. This follows
from a property of the Zeta-function. It is therefore not necessary to write down the explicit
expressions for $\kappa_2$, $\kappa_3$, etc., but we may note that their leading terms are $\tfrac12 n^{-1}$, $-\tfrac12 n^{-2}$, $n^{-3}$,
$-3n^{-4}$, $12n^{-5}$, etc., so that the leading terms of $\gamma_1$ and $\gamma_2$ are $-\sqrt{(2/n)}$ and $4/n$ respectively,
while $\gamma_s$ is $O(n^{-\frac12 s})$. More exactly we have, writing $n' = n - 1$,
$$\kappa_2 \sim \frac{1}{2n'}\Big\{1 - \frac{1}{3n'^2} + \frac{7}{15n'^4} - \frac{31}{21n'^6} + \ldots\Big\},$$
with corresponding expressions for $\kappa_3$, $\kappa_4$, etc., obtained by differentiation with respect to
$n'$, and
$$\gamma_1 \sim -\sqrt{\Big(\frac{2}{n'}\Big)}\Big\{1 - \frac{1}{2n'^2} + O\Big(\frac{1}{n'^4}\Big)\Big\}, \qquad \gamma_2 \sim \frac{4}{n'}\Big\{1 - \frac{4}{3n'^2} + O\Big(\frac{1}{n'^4}\Big)\Big\}, \qquad \gamma_r \sim (-1)^r\, r!\,\Big(\frac{2}{n'}\Big)^{\frac12 r}.$$


Finally, if instead of the distribution of $\tfrac12\log(\chi^2/n)$ we are interested in the distribution of
$\log(s^2)$, where $s^2$ is an estimate of $\sigma^2$ based on $n$ degrees of freedom, we have
$$\log(\chi^2/n) = \log s^2 - \log\sigma^2,$$
and thus for the distribution of $\log(s^2)$ we have
$$\kappa_1 = \log\Big(\frac{2\sigma^2}{n}\Big) + 2\frac{d}{dn}\log\Gamma(\tfrac12 n) = \log\Big(\frac{2\sigma^2}{n}\Big) - \mathrm{Lt}_{s\to 1}\Big\{\zeta(s,a) - \frac{1}{s-1}\Big\}$$
and
$$\kappa_s = (-1)^s\,(s-1)!\;\zeta(s,a) \quad (s > 1,\ a = \tfrac12 n),$$
while the $\gamma$ ratios are the same as for $\tfrac12\log(\chi^2/n)$. Obviously $\log\chi$ and $\log s$ can be treated
similarly. See Bartlett & Kendall (1946).
















2. THE z DISTRIBUTION 


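The distribution of $z = \tfrac12\log(s_1^2/s_2^2)$, where $s_1^2$ and $s_2^2$ are independent estimates of a variance
$\sigma^2$, based respectively on $\nu_1$ and $\nu_2$ degrees of freedom, is obviously that of
$$\tfrac12\log(\chi_1^2/\nu_1) - \tfrac12\log(\chi_2^2/\nu_2)$$
and its cumulants may therefore be at once derived from those of the logarithmic χ²
distribution. The cumulant generating function is
$$K = \log M = \tfrac12 it\log(\nu_2/\nu_1) + \log\Gamma\{\tfrac12(\nu_1 + it)\} + \log\Gamma\{\tfrac12(\nu_2 - it)\} - \log\Gamma(\tfrac12\nu_1) - \log\Gamma(\tfrac12\nu_2).$$
Further, we have
$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big) + \frac{d}{d\nu_1}\log\Gamma(\tfrac12\nu_1) - \frac{d}{d\nu_2}\log\Gamma(\tfrac12\nu_2) = \tfrac12\Big\{\log\frac{a_2}{a_1} + \mathrm{Lt}_{s\to 1}\big(\zeta(s,a_2) - \zeta(s,a_1)\big)\Big\}, \tag{7}$$
$$\kappa_s = 2^{-s}\,(s-1)!\,\{\zeta(s,a_2) + (-1)^s\zeta(s,a_1)\} \quad (s > 1,\ a_1 = \tfrac12\nu_1,\ a_2 = \tfrac12\nu_2).$$
For computing purposes these may be thrown into the forms
$$2\kappa_1 = \log(\nu_2/\nu_1) + \psi(\tfrac12\nu_1) - \psi(\tfrac12\nu_2), \qquad 2^s\kappa_s = \psi^{(s-1)}(\tfrac12\nu_1) + (-1)^s\,\psi^{(s-1)}(\tfrac12\nu_2) \quad (s > 1). \tag{8}$$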
To illustrate, let us take $\nu_1 = 24$, $\nu_2 = 60$. We then have from the Polygamma tables (except
for $\kappa_6$, which was specially computed):

κ₁ = −0.0127429,  κ₃ = −0.0007998,  κ₅ = −0.0000104,
κ₂ =  0.0301992,  κ₄ =  0.0000867,  κ₆ =  0.0000019,
σ = √κ₂ = 0.1737792,

$\gamma_1 = -0.152$, $\gamma_2 = 0.095$ (or $\beta_1 = 0.023$, $\beta_2 = 3.095$), indicating the degree and nature of
the departure from normality. $\gamma_3$ and $\gamma_4$ are −0.066 and 0.067 respectively.

If as a first approximation we assume that for $\nu_1$ and $\nu_2$ of the order of the numbers chosen
in this example, or higher, z is distributed normally with mean and variance given by the
above formulae, we obtain approximate percentage points, e.g. for the 95 and 5% points
we can subtract and add 1.6449σ from and to the mean. The result in the present case is to
give us −0.299 and 0.273, the correct values being −0.306 and 0.265. The approximation is
adequate to almost two figure accuracy, and is evidently useful when we only require to
know whether an observed z is significant or not. A better approximation is provided by
the formulae attached to the z tables (see Fisher & Yates (1943)), which yield −0.3045 and
0.2653 as against the correct values of −0.3055 (see Thompson (1941)) and 0.2654.
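The first two cumulants of the worked example, and the normal-approximation percentage points derived from them, can be reproduced with a few lines. This is a sketch under stated assumptions: the Digamma and Trigamma routines below stand in for the printed Polygamma tables (recurrence plus asymptotic expansion), and the function names are hypothetical.

```python
from math import log, sqrt


def digamma(a, terms=30):
    """psi(a): shift the argument upward, then use the asymptotic expansion."""
    x = a + terms
    asym = log(x) - 1 / (2 * x) - 1 / (12 * x ** 2) + 1 / (120 * x ** 4) - 1 / (252 * x ** 6)
    return asym - sum(1.0 / (a + j) for j in range(terms))


def trigamma(a, terms=30):
    """psi'(a): shift the argument upward, then use the asymptotic expansion."""
    x = a + terms
    asym = 1 / x + 1 / (2 * x ** 2) + 1 / (6 * x ** 3) - 1 / (30 * x ** 5) + 1 / (42 * x ** 7)
    return asym + sum((a + j) ** -2 for j in range(terms))


def z_mean_and_variance(v1, v2):
    """kappa_1 and kappa_2 of z from the computing forms (8)."""
    kappa1 = 0.5 * (log(v2 / v1) + digamma(v1 / 2) - digamma(v2 / 2))
    kappa2 = 0.25 * (trigamma(v1 / 2) + trigamma(v2 / 2))
    return kappa1, kappa2


kappa1, kappa2 = z_mean_and_variance(24, 60)
sigma = sqrt(kappa2)
# Normal approximation to the 95% and 5% points: mean -/+ 1.6449 sigma.
lower, upper = kappa1 - 1.6449 * sigma, kappa1 + 1.6449 * sigma
print(f"kappa1 = {kappa1:.7f}, kappa2 = {kappa2:.7f}, sigma = {sigma:.7f}")
print(f"approximate percentage points: {lower:.3f}, {upper:.3f}")
```

The higher cumulants would need the higher Polygamma functions, which are not coded here.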

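Explicit algebraic expressions are readily written down for the cumulants for small $\nu_1$
and $\nu_2$, using the same method as for the logarithmic χ² distribution. Where it is necessary
to do so, $\nu_1$ will be assumed less than $\nu_2$. In the contrary case we need only interchange $\nu_1$
and $\nu_2$, changing the sign of the odd cumulants in so doing. The odd cumulants are zero when
$\nu_1 = \nu_2$. We have

Even cumulants ($r = 2s$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$
$$\kappa_r = 2(r-1)!\Big\{2^{-r}\zeta(r) - \Big(\frac{1}{2^r} + \frac{1}{4^r} + \ldots + \frac{1}{(2p-2)^r}\Big) - \frac12\Big(\frac{1}{(2p)^r} + \frac{1}{(2p+2)^r} + \ldots + \frac{1}{(2q-2)^r}\Big)\Big\}. \tag{9}$$

$\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_r = 2(r-1)!\Big\{\zeta(r)(1 - 2^{-r}) - \Big(1 + \frac{1}{3^r} + \ldots + \frac{1}{(2p-1)^r}\Big) - \frac12\Big(\frac{1}{(2p+1)^r} + \frac{1}{(2p+3)^r} + \ldots + \frac{1}{(2q-1)^r}\Big)\Big\}. \tag{10}$$

$\nu_1 = \nu_2$. Drop out the last bracket of terms in the above two cases.

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_r = (r-1)!\Big\{\zeta(r) - \Big(\frac{1}{2^r} + \frac{1}{4^r} + \ldots + \frac{1}{(2p-2)^r}\Big) - \Big(1 + \frac{1}{3^r} + \frac{1}{5^r} + \ldots + \frac{1}{(2q-1)^r}\Big)\Big\}. \tag{11}$$
$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case.

Odd cumulants ($r = 2s+1$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_r = -(r-1)!\Big\{\frac{1}{\nu_1^r} + \frac{1}{(\nu_1+2)^r} + \ldots + \frac{1}{(\nu_2-2)^r}\Big\}. \tag{12}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_r = (r-1)!\Big\{\zeta(r)(1 - 2^{1-r}) + \Big(\frac{1}{2^r} + \frac{1}{4^r} + \ldots + \frac{1}{(2p-2)^r}\Big) - \Big(1 + \frac{1}{3^r} + \frac{1}{5^r} + \ldots + \frac{1}{(2q-1)^r}\Big)\Big\}. \tag{13}$$
$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case, and change the sign of $\kappa_r$.

In the special case of $s = 1$, we have

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big) - \Big(\frac{1}{\nu_1} + \frac{1}{\nu_1+2} + \ldots + \frac{1}{\nu_2-2}\Big). \tag{14}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big) + \log 2 + \Big(\frac12 + \frac14 + \ldots + \frac{1}{2p-2}\Big) - \Big(1 + \frac13 + \frac15 + \ldots + \frac{1}{2q-1}\Big). \tag{15}$$
$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case and change the sign of $\kappa_1$.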
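For large $\nu_1$ and $\nu_2$, a combination of the asymptotic formulae already given readily yields
the following results:
$$\kappa_1 \sim \frac12\Big(\frac{1}{\nu_2} - \frac{1}{\nu_1}\Big) + \sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j}{j}\Big(\frac{1}{\nu_2^{2j}} - \frac{1}{\nu_1^{2j}}\Big),$$
the numerical coefficients being as for the $\kappa_1$ of $\tfrac12\log(\chi^2/n)$,
$$\kappa_s \sim \frac{(s-2)!}{2}\Big(\frac{1}{\nu_2^{s-1}} + \frac{(-1)^s}{\nu_1^{s-1}}\Big) + \frac{(s-1)!}{2}\Big(\frac{1}{\nu_2^{s}} + \frac{(-1)^s}{\nu_1^{s}}\Big) + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!}\Big(\frac{1}{\nu_2^{2j+s-1}} + \frac{(-1)^s}{\nu_1^{2j+s-1}}\Big). \tag{16}$$
We may put $s = 1$ in $\kappa_s$ provided we drop the first term. We note also that $\kappa_2$ and higher
cumulants can be written down immediately by differentiating the terms in $\nu_2$ and $\nu_1$ of
$\kappa_1 - \tfrac12\log(\nu_2/\nu_1)$ successively with respect to $-\nu_2$ and $\nu_1$ respectively.

These are the results given by Cornish & Fisher (1937), whose formulae can be extended
at sight by means of the results of this paper. A first approximation not only gives the
familiar results
$$\kappa_1 \sim \frac12\Big(\frac{1}{\nu_2} - \frac{1}{\nu_1}\Big), \qquad \kappa_2 \sim \frac12\Big(\frac{1}{\nu_1} + \frac{1}{\nu_2}\Big),$$
but also the more general
$$\kappa_s \sim \frac{(s-2)!}{2}\Big(\frac{1}{\nu_2^{s-1}} + \frac{(-1)^s}{\nu_1^{s-1}}\Big),$$
but it should be noted that for all $s > 1$ a second approximation, which takes in an additional
term, is
$$\kappa_s \sim \frac{(s-2)!}{2}\Big\{\frac{1}{(\nu_2-1)^{s-1}} + \frac{(-1)^s}{(\nu_1-1)^{s-1}}\Big\}.$$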
The accuracy of the asymptotic approximation at the limits of the z table given by Fisher
& Yates (1943) can be seen by applying it to our example ($\nu_1 = 24$, $\nu_2 = 60$). The numbers
of terms which are significant in the eighth place (needed for final accuracy to 7 decimal
places), are three for $\kappa_1$, four for $\kappa_2$ and $\kappa_3$, and three for $\kappa_4$, $\kappa_5$ and $\kappa_6$. The first term for $\kappa_6$,
namely $12(\nu_1^{-5} + \nu_2^{-5})$, yields 0.0000015, rather more than 20% too low. To use
$$12\{(\nu_2 - 1)^{-5} + (\nu_1 - 1)^{-5}\}$$
would give 0.0000019, about 2% too high.


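Should $\nu_1$ or $\nu_2$ be only moderate in size, the other being large, we may make use of the
relation
$$\zeta(s,a) = \frac{1}{a^s} + \frac{1}{(a+1)^s} + \ldots + \frac{1}{(a+r-1)^s} + \zeta(s, a+r) \quad (r \text{ an integer}),$$
where $a$ is one-half of the smaller of $\nu_1$ or $\nu_2$, to convert our formulae into forms in which
asymptotic expansions may be applied to both of the Zeta-functions. We then have ($\nu_1 < \nu_2$):
$$\kappa_1 = \frac12\Big[\log\Big(\frac{\nu_2}{\nu_1}\Big) + \mathrm{Lt}_{s\to 1}\{\zeta(s,\tfrac12\nu_2) - \zeta(s,\tfrac12\nu_1 + r)\}\Big] - \Big(\frac{1}{\nu_1} + \frac{1}{\nu_1+2} + \ldots + \frac{1}{\nu_1+2r-2}\Big),$$
$$\kappa_s = (s-1)!\Big[2^{-s}\{\zeta(s,\tfrac12\nu_2) + (-1)^s\zeta(s,\tfrac12\nu_1 + r)\} + (-1)^s\Big\{\frac{1}{\nu_1^s} + \frac{1}{(\nu_1+2)^s} + \ldots + \frac{1}{(\nu_1+2r-2)^s}\Big\}\Big], \tag{17}$$
while
$$\zeta(s,a) \sim \frac{1}{(s-1)a^{s-1}} + \frac{1}{2a^s} + \frac{1}{(s-1)!}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}B_j\,(2j+s-2)!}{(2j)!\;a^{2j+s-1}}.$$

Particular cases of some interest arise (i) when $r = \tfrac12(\nu_2 - \nu_1)$, $\nu_1$ and $\nu_2$ being either both odd
or both even, and (ii) when $r = \tfrac12(\nu_2 - \nu_1 + 1)$, $\nu_1$ (or $\nu_2$) being even and $\nu_2$ (or $\nu_1$) odd. In the
former case the first term within squared brackets of $\kappa_s$ is $2^{-s}(1 + (-1)^s)\,\zeta(s,\tfrac12\nu_2)$, which is
zero when $s$ is odd and $2^{1-s}\zeta(s,\tfrac12\nu_2)$ when $s$ is even. In the latter we have
$$2^{-s}\{\zeta(s,\tfrac12\nu_2) + (-1)^s\zeta(s,\tfrac12(\nu_2+1))\}$$
which is $\zeta(s,\nu_2)$ when $s$ is even. With $s$ odd we are concerned with the difference of two
Zeta-functions in which the $a$'s differ by one-half, and the expression may be written
$$\sum_{j=0}^{\infty}\frac{(-1)^j}{(\nu_2+j)^s} = \frac{(-1)^{s-1}}{(s-1)!}\Big(\frac{d}{d\nu_2}\Big)^{s-1}\sum_{j=0}^{\infty}\frac{(-1)^j}{\nu_2+j},$$
and
$$\sum_{j=0}^{\infty}\frac{(-1)^j}{\nu_2+j} = \int_0^1\frac{x^{\nu_2-1}}{1+x}\,dx = \sum_{j=0}^{\infty}\frac{j!\,(\nu_2-1)!}{2^{j+1}(\nu_2+j)!} \quad\text{on integration by parts}$$
$$\sim \frac{1}{2\nu_2} + \sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,\nu_2^{2j}}$$
on expansion in powers of $\nu_2^{-1}$. This asymptotic expansion is an interesting one in which the
early coefficients are very simple, for the series is
$$\frac{1}{2\nu_2} + \frac{1}{4\nu_2^2} - \frac{1}{8\nu_2^4} + \frac{1}{4\nu_2^6} - \frac{17}{16\nu_2^8} + \ldots \tag{18}$$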








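The various cases are set out below:

Even cumulants ($r = 2s$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_r \sim (r-1)!\Big\{\frac{1}{\nu_1^r} + \frac{1}{(\nu_1+2)^r} + \ldots + \frac{1}{(\nu_2-2)^r}\Big\} + \frac{(r-2)!}{\nu_2^{r-1}} + \frac{(r-1)!}{\nu_2^{r}} + 4\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+r-2)!}{(2j)!\;\nu_2^{2j+r-1}}. \tag{19}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_r \sim (r-1)!\Big\{\frac{1}{\nu_1^r} + \frac{1}{(\nu_1+2)^r} + \ldots + \frac{1}{(\nu_2-1)^r}\Big\} + \frac{(r-2)!}{\nu_2^{r-1}} + \frac{(r-1)!}{2\nu_2^{r}} + \sum_{j=1}^{\infty}\frac{(-1)^{j-1}B_j\,(2j+r-2)!}{(2j)!\;\nu_2^{2j+r-1}}. \tag{20}$$

Odd cumulants ($r = 2s+1$, $s > 0$)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_r = -(r-1)!\Big\{\frac{1}{\nu_1^r} + \frac{1}{(\nu_1+2)^r} + \ldots + \frac{1}{(\nu_2-2)^r}\Big\}. \tag{21}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_r \sim -(r-1)!\Big\{\frac{1}{\nu_1^r} + \frac{1}{(\nu_1+2)^r} + \ldots + \frac{1}{(\nu_2-1)^r}\Big\} + \frac{(r-1)!}{2\nu_2^{r}} + \sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j\,(2j+r-2)!}{(2j)!\;\nu_2^{2j+r-1}}. \tag{22}$$

In the special case of $s = 1$, we have

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$
$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big) - \Big(\frac{1}{\nu_1} + \frac{1}{\nu_1+2} + \ldots + \frac{1}{\nu_2-2}\Big).$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$
$$\kappa_1 \sim \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big) - \Big(\frac{1}{\nu_1} + \frac{1}{\nu_1+2} + \ldots + \frac{1}{\nu_2-1}\Big) + \frac{1}{2\nu_2} + \sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,\nu_2^{2j}}. \tag{23}$$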


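3. THE LOGARITHMIC t-DISTRIBUTION


When $\nu_1 = 1$, $z = \log|t|$, and we thus have as a special case for the distribution of $\log|t|$ for
$\nu_2 = n$ degrees of freedom:
$$2\kappa_1 = \log n + \mathrm{Lt}_{s\to 1}\{\zeta(s,\tfrac12 n) - \zeta(s,\tfrac12)\} \tag{24}$$
$$= \log n + \psi(\tfrac12) - \psi(\tfrac12 n) \qquad (\psi(\tfrac12) = -\gamma - 2\log 2). \tag{25}$$
For small $n$
$$n = 2p: \quad \kappa_1 = \tfrac12\log n - \log 2 - \Big(\frac12 + \frac14 + \ldots + \frac{1}{n-2}\Big),$$
$$n = 2p+1: \quad \kappa_1 = \tfrac12\log n - \Big(1 + \frac13 + \frac15 + \ldots + \frac{1}{n-2}\Big). \tag{26}$$
For large $n$
$$\kappa_1 \sim -\tfrac12(\gamma + \log 2) + \frac{1}{2n} + \sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j}{j\,n^{2j}}. \tag{27}$$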


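Also
$$2^s\kappa_s = (s-1)!\,\{\zeta(s,\tfrac12 n) + (-1)^s\zeta(s,\tfrac12)\} \quad (s > 1), \tag{28}$$
$$= \psi^{(s-1)}(\tfrac12) + (-1)^s\,\psi^{(s-1)}(\tfrac12 n) \qquad (\psi^{(s-1)}(\tfrac12) = (-1)^s(s-1)!\,(2^s - 1)\,\zeta(s)). \tag{29}$$

For small $n$ we have the following cases:

Even cumulants ($r = 2s$, $s > 0$)
$$n = 2p: \quad \kappa_r = (r-1)!\Big\{\zeta(r) - \Big(\frac{1}{2^r} + \frac{1}{4^r} + \ldots + \frac{1}{(n-2)^r}\Big)\Big\}, \tag{30}$$
$$n = 2p+1: \quad \kappa_r = (r-1)!\Big\{2\zeta(r)(1 - 2^{-r}) - \Big(1 + \frac{1}{3^r} + \frac{1}{5^r} + \ldots + \frac{1}{(n-2)^r}\Big)\Big\}.$$
Odd cumulants ($r = 2s+1$, $s > 0$)
$$n = 2p: \quad \kappa_r = -(r-1)!\Big\{\zeta(r)(1 - 2^{1-r}) + \Big(\frac{1}{2^r} + \frac{1}{4^r} + \ldots + \frac{1}{(n-2)^r}\Big)\Big\}, \tag{31}$$
$$n = 2p+1: \quad \kappa_r = -(r-1)!\Big\{1 + \frac{1}{3^r} + \frac{1}{5^r} + \ldots + \frac{1}{(n-2)^r}\Big\}.$$
For large $n$
$$\kappa_s \sim (-1)^s(s-1)!\,\zeta(s)(1 - 2^{-s}) + \frac{(s-2)!}{2n^{s-1}} + \frac{(s-1)!}{2n^{s}} + 2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\;n^{2j+s-1}}. \tag{32}$$

In the special case of $n = \infty$ we have for the distribution of $\log|x|$, where $x$ is a normal
variable with zero mean and unit standard deviation:
$$\kappa_1 = -\tfrac12(\gamma + \log 2), \qquad \kappa_s = (-1)^s(s-1)!\,\zeta(s)(1 - 2^{-s}), \tag{33}$$
as follows also from the case of $\tfrac12\log(\chi^2/n)$ on putting $n = 1$.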
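The mean of $\log|t|$ gives a convenient cross-check between the general form (25), the closed forms (26) for small n, and the limiting value in (33). The sketch below is illustrative only; the Digamma routine is an assumed numerical stand-in for tables, and the function names are hypothetical.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def digamma(a, terms=30):
    """psi(a): shift the argument upward, then use the asymptotic expansion."""
    x = a + terms
    asym = log(x) - 1 / (2 * x) - 1 / (12 * x ** 2) + 1 / (120 * x ** 4) - 1 / (252 * x ** 6)
    return asym - sum(1.0 / (a + j) for j in range(terms))


def kappa1_general(n):
    """Formula (25): 2 kappa_1 = log n + psi(1/2) - psi(n/2)."""
    psi_half = -EULER_GAMMA - 2 * log(2)
    return 0.5 * (log(n) + psi_half - digamma(n / 2))


def kappa1_small_n(n):
    """Formula (26): finite sums for even and odd n."""
    if n % 2 == 0:
        # 1/2 + 1/4 + ... + 1/(n - 2)
        return 0.5 * log(n) - log(2) - sum(1 / k for k in range(2, n - 1, 2))
    # 1 + 1/3 + ... + 1/(n - 2)
    return 0.5 * log(n) - sum(1 / k for k in range(1, n - 1, 2))


max_gap = max(abs(kappa1_general(n) - kappa1_small_n(n)) for n in range(1, 13))
limit = -0.5 * (EULER_GAMMA + log(2))   # kappa_1 of log|x| in (33)
print(max_gap, kappa1_general(10 ** 6), limit)
```

For a very large number of degrees of freedom the general form should approach the limiting value, the difference being close to 1/(2n) as in (27).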


4. NOTE ON THE χ² DISTRIBUTION APPROXIMATION


Fisher's result that $\sqrt{(2\chi^2)}$ is approximately normally distributed about a mean of $\sqrt{(2n-1)}$
with unit variance ($n$ being the number of degrees of freedom) is well known. The
demonstration depends on showing that the mean value of $\chi$ is
$$\kappa_1 = \sqrt2\,\Gamma\{\tfrac12(n+1)\}/\Gamma(\tfrac12 n) \sim \sqrt{(n - \tfrac12)} \quad\text{for large } n$$
and that the variance is
$$n - \kappa_1^2 \sim \tfrac12,$$
but to this order of approximation it is not possible to show that $\gamma_1$ and $\gamma_2$ tend to zero with
increasing $n$. A formula for the ratio of the two Gamma functions, developed as far as terms
in $n^{-3}$ (see Wishart (1925)), gives $\gamma_1 \sim (2n)^{-\frac12}$ and $\gamma_2 = O(n^{-2})$ (see, for example, Kendall
(1945)), but owing to the vanishing of the term in $n^{-1}$ of $\gamma_2$ its leading term has so far not been
accurately obtained, although the exact (but somewhat complicated) expressions for the
$\beta_1$ and $\beta_2$ of the distribution of $s = \sigma\chi/\sqrt{(n+1)}$ were given in an editorial in Biometrika (1915),
10, 522.
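The two limiting statements about the mean and variance of $\chi$ can be illustrated numerically. The sketch below evaluates the exact mean $\sqrt2\,\Gamma\{\tfrac12(n+1)\}/\Gamma(\tfrac12 n)$ through log-Gamma values to avoid overflow; the function names are hypothetical and the choice of n values is arbitrary.

```python
from math import exp, lgamma, sqrt


def chi_mean(n):
    """Exact mean of chi with n degrees of freedom."""
    return sqrt(2) * exp(lgamma((n + 1) / 2) - lgamma(n / 2))


def chi_variance(n):
    """Exact variance of chi: E(chi^2) - (E chi)^2 = n - kappa_1^2."""
    return n - chi_mean(n) ** 2


for n in (10, 100, 1000):
    print(n, chi_mean(n), sqrt(n - 0.5), chi_variance(n))
```

The gap between the exact mean and $\sqrt{(n - \tfrac12)}$, and between the variance and ½, should shrink as n increases.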
Since
$$\tfrac12\{\psi(\tfrac12(n+1)) - \psi(\tfrac12 n)\} = \frac{1}{2n} + \sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,n^{2j}},$$
by the formula given in § 2, we find on integration, and insertion of the appropriate constant,
that
$$\log\Gamma\{\tfrac12(n+1)\} - \log\Gamma(\tfrac12 n) = \tfrac12\log(\tfrac12 n) + \sum_{j=1}^{\infty}\frac{(-1)^{j}(2^{2j}-1)B_j}{2j(2j-1)\,n^{2j-1}},$$
and thus have
$$\frac{\Gamma\{\tfrac12(n+1)\}}{\Gamma(\tfrac12 n)} = \sqrt{(\tfrac12 n)}\,\exp\Big\{\sum_{j=1}^{\infty}\frac{(-1)^{j}(2^{2j}-1)B_j}{2j(2j-1)\,n^{2j-1}}\Big\}, \tag{34}$$
which can readily be expanded to give the additional terms necessary to enable the
cumulants of $\chi$ (or of $\sqrt{(2\chi^2)}$) to be worked out (see Johnson & Welch (1939)). Taking $\sqrt{\{\tfrac12(n - \tfrac12)\}}$
as the first approximation we find


hin ‘)exe bront(!+ tanta] 


=f "5 2 ‘\(! sy, + O(n"), (35) 


thus providing a second approximation to the ratio of two Gamma functions differing by 
one-half. The cumulants of ,/(2x?) are 


kK, = ¥(2n—-1) (: + iiain=) + O(n-*5), 


(n—1) 
we ee 
Ka bt ant Bans ON) 
1 1 13 
iit gis = -3-5 6 
Ks a5 (!+a5- ami) + OU" ) 7) 
ye 
Kg = oat +5) + O(n-*), 
that pe 142 : + O(n-**) 
wba Ya = J(2n) + on seat) : 


3 3 
Y2= = pa(!+5, =) + O(n-4). 


The Editorial in Biometrika (1915), 10, 523 calls attention in a footnote to 'Student's'
approximations for the $\beta_1$ and $\beta_2$ of the sample standard deviation. The above formulae
show that 'Student's' results should be


= 3(1 + mat goa) + OW), 


in which n is now the size of the sample. For n = 10 these give values too low by 2 and 5 
respectively in the fourth place of decimals. Practically four-figure accuracy can be attained 
with n as low as 10 if in the terms in n~* we replace 31/8 by 17/4 and 7/8 by 1 


REFERENCES

BARTLETT, M. S. & KENDALL, D. G. (1946). J.R. Statist. Soc. Suppl. 8, 128.
BRITISH ASSOCIATION (1931). Mathematical Tables, 1, 42. London: Cambridge University Press.
CORNISH, E. A. & FISHER, R. A. (1937). Rev. Inst. Internat. Statist. 4, 1.
DAVIS, H. T. (1933, 1935). Tables of the Higher Mathematical Functions, 1, 2. Indiana: Principia Press.
FISHER, R. A. & YATES, F. (1943). Statistical Tables, 2nd ed. London: Oliver and Boyd.
JOHNSON, N. L. & WELCH, B. L. (1939). Biometrika, 31, 216.
KENDALL, M. G. (1945). Advanced Theory of Statistics, 1, 2nd ed., § 12.7. London: Griffin and Co.
PAIRMAN, E. (1919). Tracts for Computers, no. 1. London: Cambridge University Press.
THOMPSON, C. M. (1941). Biometrika, 32, 168.
WISHART, J. (1925). Biometrika, 17, 68.







THE MEANING OF A SIGNIFICANCE LEVEL 
By G. A. BARNARD 


A level of significance is a probability. To say that a given result is significant on the 5%
level means that some class of events has probability 0.05. Now whatever theory we may
hold as to the nature of probability, in order to give a statement of probability a precise
meaning we must refer to some reference class, or set of data, on which the probability is
calculated. What is the reference class involved in a level of significance?

To many people the answer to this question seems simple enough. The reference class 
involved is the set of indefinite (possibly imaginary) repetitions of the experiment which 
gave the result in question. Otherwise put, the data, on which the probability is calculated, 
are the external conditions of the experiment. The following example indicates, however, 
that the meaning of this reference class is not always clear. The example is a modified form 
of one given by Prof. R. A. Fisher in a letter to the author. 

Suppose we have a bag of chrysanthemum seeds, known to give plants having white 
flowers or plants having purple flowers, no other colours being possible. We suspect that 
the proportions of white and purple seeds are equal, and to test this hypothesis we select 
at random ten seeds from the bag, and plant them. Nine of the plants grow to maturity, 
and all of them have white flowers. On what level of significance can we reject the hypothesis 
of equality of proportions? We may assume that white and purple plants are equally viable. 

It would be natural to argue that, if white and purple flowers were equally likely, the
probability of our result would be $1/2^9$. If there is no reason to suspect an excess of white
rather than an excess of purple flowers, we must add to this the probability of getting nine
purple flowers, which is also $1/2^9$, giving a total probability of $1/2^8$. The hypothesis of equality
of proportions would then be rejected on the 1/256, or the 0.3906% level of significance.
But if we did this our reference class would not be the set of indefinite repetitions of the
experiment, in its ordinary meaning.

A repetition of the experiment, in its ordinary meaning, would consist of another selection 
of ten seeds from the bag, and their planting and growth. On such another occasion all ten 
plants might grow to maturity, or all or some might die. These possibilities have not been 
taken into account in our calculation of probability, so far. 

To allow for the possible variation in the number of plants which grow, we might lay out 
the set of all possible results of the experiment as in Fig. 1, where n denotes the number 
of plants that grow, and r denotes the excess of white over purple. Thus any point in the figure 
can be referred to uniquely by its co-ordinates (n, r). If we now introduce a parameter p, 
to denote the probability (if it exists) that a plant will grow to maturity, given that it has 
been selected, the probability associated with the point (n, r) on the hypothesis of equality 
of proportions of white and purple will be 

$$W(n, r; p) = \frac{10!}{n!\,(10-n)!}\,p^{n}(1-p)^{10-n} \cdot \frac{n!}{\left(\frac{n+r}{2}\right)!\,\left(\frac{n-r}{2}\right)!}\,2^{-n},$$

and since this is a function of the unknown p, we have a special problem of arranging the 
points (n, r) in order of significance before we can establish a test. The situation in this 
respect is similar to that dealt with in the paper on 2 x 2 tables, printed earlier in this issue 
(Barnard, 1946, pp. 123-38 above). 
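
As a rough illustration of the probability attached to each point (n, r), the sketch below evaluates it as a binomial term for the number of plants that grow, multiplied by a binomial term for their colours. The names `w`, `lattice` and `N_SEEDS` are assumptions made for this example, and the value p = 0.7 is arbitrary.

```python
from math import comb

N_SEEDS = 10  # seeds planted in the example


def w(n: int, r: int, p: float) -> float:
    """Probability of the point (n, r) when white and purple are equally likely."""
    # Points off the lattice (wrong parity, or |r| > n) have probability zero.
    if n < 0 or n > N_SEEDS or abs(r) > n or (n + r) % 2 != 0:
        return 0.0
    whites = (n + r) // 2
    prob_n_grow = comb(N_SEEDS, n) * p**n * (1 - p) ** (N_SEEDS - n)
    prob_colours = comb(n, whites) * 0.5**n
    return prob_n_grow * prob_colours


def lattice() -> list[tuple[int, int]]:
    """All attainable points (n, r): r runs from -n to n in steps of 2."""
    return [(n, r) for n in range(N_SEEDS + 1) for r in range(-n, n + 1, 2)]


# The probabilities over the whole lattice should add up to one for any p.
total = sum(w(n, r, 0.7) for n, r in lattice())
print(len(lattice()), round(total, 12))
```

Because the result depends on the unknown p, no single ordering of the points by probability is available, which is the difficulty the text describes.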










[Fig. 1. Diagram of the possible results (n, r): horizontal axis n, from 0 to 10; vertical axis r.] 

Proceeding as in the earlier paper, we notice first that the same level of significance must 
apply to (n, r) as to (n, -r), so that we can confine our further considerations to the upper 
half of the diagram. Now in this half, the transition from (n, r) to (n+1, r+1) means we 
discover that one of the plants which failed to grow in our case was in fact a white-flowered 
plant. In this case our conviction that there is an excess of white-flowered plants would be 
strengthened, so that (n+1, r+1) would be reckoned more significant than (n, r). Similarly, 
going from (n, r) to (n+1, r-1) would mean that a missing plant was found to be purple, 
and this would weaken our belief in an excess of white-flowered plants; consequently, 
(n, r) would be reckoned more significant than (n+1, r-1). Finally, going from (n, r) to 
(n+2, r) would mean growing two more plants, one purple and one white, and this would 
increase our tendency to believe in the equality of proportions. Consequently, (n, r) would 
be reckoned more significant than (n+2, r). These principles taken together imply that 
points lying north-east, or west, of a given point (n, r), or between these two directions, 
would be reckoned more significant than (n, r); while, conversely, points lying east to 
south-west (inclusive) from (n, r) would be reckoned less significant than (n, r). The relative 
significance of points lying inside the half-quadrants north-east to east and south-west to 
west would remain undetermined. 
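
The partial ordering just described can be sketched as a comparison of two points. This is one reading of the three principles, not a procedure given in the paper: the function name `compare` and its return labels are assumptions, and both points are folded into the upper half of the diagram by taking |r|.

```python
def compare(base: tuple[int, int], other: tuple[int, int]) -> str:
    """Say how `other` ranks relative to `base` under the partial ordering.

    Returns "more", "less", "same" or "undetermined" (significance of `other`).
    """
    dn = other[0] - base[0]
    dr = abs(other[1]) - abs(base[1])  # fold (n, -r) onto (n, r)
    if dn == 0 and dr == 0:
        return "same"
    # North-east round through north to west: more significant.
    if dr >= 0 and dr >= dn:
        return "more"
    # East round through south to south-west: less significant.
    if dr <= 0 and dr <= dn:
        return "less"
    # Half-quadrants north-east to east, and south-west to west.
    return "undetermined"


print(compare((5, 1), (6, 2)))  # one more white plant
print(compare((5, 1), (6, 0)))  # one more purple plant
print(compare((5, 1), (7, 1)))  # one more of each colour
print(compare((5, 1), (8, 2)))  # neither principle settles this pair
```

The two inequalities simply express the cones swept out by the moves (+1, +1), (+1, -1) and (+2, 0) and their reverses.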

We could now proceed as in the earlier paper (Barnard, 1946), building up a test, consistent with the above 
partial ordering, in such a way as to make the significance or otherwise of our result depend 
as little as possible on any knowledge we may have about the value of p. But we need not 
carry this through for the result we have quoted, since our conditions by themselves require 
that the only points in the diagram which should be reckoned not less significant than our 
result are the points (9, 9), (9, -9), (10, 10) and (10, -10). The probability associated with 
these four points is 

$$P(9, 9; p) = 2\{10\,p^{9}(1-p)\,2^{-9} + p^{10}\,2^{-10}\} = (p/2)^{9}(20 - 19p),$$

the maximum value of which occurs when p = 18/19, and is P_m(9, 9) = 0.002413. Thus on 
this basis we should conclude that our result was significant on the 0.2413 % level. 
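
A numerical check of this step is sketched below, under stated assumptions. The helper names (`w`, `less_significant`, `tail`) are introduced for the example only, and the "less significant" rule is the reading of the partial ordering used above. The sketch lists the points not ranked below (9, 9), compares their total probability with the closed form, and scans a grid of p values for the maximum, which comes out at roughly 0.24 %.

```python
from math import comb

N_SEEDS = 10
OBSERVED = (9, 9)


def w(n: int, r: int, p: float) -> float:
    """Probability of (n, r) under equal proportions of white and purple."""
    whites = (n + r) // 2
    return (comb(N_SEEDS, n) * p**n * (1 - p) ** (N_SEEDS - n)
            * comb(n, whites) * 0.5**n)


def less_significant(point: tuple[int, int], base: tuple[int, int]) -> bool:
    """True when `point` lies east to south-west (inclusive) of `base`."""
    dn = point[0] - base[0]
    dr = abs(point[1]) - abs(base[1])
    return (dn, dr) != (0, 0) and dr <= 0 and dr <= dn


points = [(n, r) for n in range(N_SEEDS + 1) for r in range(-n, n + 1, 2)]
critical = sorted(pt for pt in points if not less_significant(pt, OBSERVED))


def tail(p: float) -> float:
    """Total probability of the points not less significant than the result."""
    return sum(w(n, r, p) for n, r in critical)


p_star = 18 / 19  # stationary point of p^9 (20 - 19p)
grid_max = max(tail(k / 1000) for k in range(1001))
print(critical)
print(tail(p_star), grid_max)
```

The grid scan is a crude safeguard; the stationary point itself follows from differentiating p^9 (20 - 19p).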













The difference between the first result, 0.3906 %, and the second, 0.2413 %, is in practice 
negligible. Somewhat larger differences will be found in other similar cases, however, and 
it seems worth while to try to clarify the cause of the discrepancy. 

Consider three possible causes for the failure of the tenth plant to grow to maturity: 

(1) The bag from which the seed was taken is known to contain a proportion of dead seeds, 
which are physically indistinguishable from the live ones, and the tenth seed planted 
happened to be one of these. The conditions of growth were such that any live seed planted 
would have grown. 

(2) The tenth plant happened to be attacked by a soil pest, which destroyed it. 

(3) The statistician trod on the tenth plant while running for a bus; otherwise, it would 
have grown. 

If we now consider what would happen in these three cases if the experiment were 
repeated, in case (1) we should be just as uncertain as before how many plants would grow, 
out of those selected. In case (2), we might or might not happen to strike a good year for 
the pest in question, so that we might or might not have a similar accident recurring. In 
case (3) we should obviously give the statistician firm instructions not to be careless, and 
then we could be reasonably certain that all the plants selected would grow.* 

In the first case, we can suppose that the proportions of white, purple, and dead seeds in 
the bag are, respectively, p_1, p_2, and 1 - (p_1 + p_2); and the purpose of our experiment is to 
test the hypothesis p_1 = p_2. In this case, putting p_1 + p_2 = p, we can clearly apply the 
analysis of Fig. 1, and the appropriate level of significance is 0.2413 %. 
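
One way to see why the analysis of Fig. 1 carries over to this first case is to compare, point by point, the three-category (white, purple, dead) probability with the two-stage probability of (n, r). The sketch below does this for one arbitrary value of p; the names `trinomial` and `w`, and the choice p = 0.8, are assumptions made for the illustration.

```python
from math import comb, factorial, isclose

N_SEEDS = 10


def trinomial(whites: int, purples: int, p1: float, p2: float) -> float:
    """Chance of a given split into white, purple and dead seeds."""
    dead = N_SEEDS - whites - purples
    coeff = factorial(N_SEEDS) // (
        factorial(whites) * factorial(purples) * factorial(dead)
    )
    return coeff * p1**whites * p2**purples * (1 - p1 - p2) ** dead


def w(n: int, r: int, p: float) -> float:
    """Two-stage form: n plants grow, then colours split evenly at random."""
    whites = (n + r) // 2
    return (comb(N_SEEDS, n) * p**n * (1 - p) ** (N_SEEDS - n)
            * comb(n, whites) * 0.5**n)


p = 0.8  # arbitrary value of p1 + p2 for the check
agree = all(
    isclose(trinomial(wh, pu, p / 2, p / 2), w(wh + pu, wh - pu, p),
            rel_tol=1e-9, abs_tol=1e-15)
    for wh in range(N_SEEDS + 1)
    for pu in range(N_SEEDS + 1 - wh)
)
print(agree)
```

Under the hypothesis that the two proportions are equal, each is p/2, and the two forms agree term by term.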

In the third case, the situation actually realized is just what it would have been if we had 
warned the statistician beforehand, and then thrown one of the ten seeds back into the bag. 
Thus our effective sample size here is 9, and the appropriate level of significance is 0.3906 %. 

In the second case, the answer depends on our attitude to the set of accidents of which 
the pest is a specimen. If this set of accidents is regarded as a stable set of chance causes 
we may be justified in representing its effect on the growth of our plants by the probability p. 
If, on the other hand, the incidence of such pests undergoes, say, regular cyclical 
fluctuations from year to year, so that its incidence is to some extent predictable, if not 
wholly controllable, then we should not be justified in assuming the existence of a real 
probability corresponding to our parameter p. We should, to be on the safe side, in this case 
allow for the possibility that experimental technique might improve in the future, to such 
an extent as to eliminate the possibility of such accidents. Thus, adopting this conservative 
attitude to our results, we should here treat the effective sample size as 9. The repetitions 
of the experiment which we have in mind would then be imaginary repetitions, in which 
experimental technique was supposed to be better than it is now, and we have as much 
control over pests as we have over statisticians. 

* It is not suggested that the three cases exhaust the multiplicity of types which might arise in 
practice. As Prof. Pearson has pointed out, if it were not the statistician, but his three-year-old son 
who was the vandal in case (3), we should have here a situation intermediate between our second and 
third instances. 

The general situation illustrated by this example can be described in terms of the notion 
of ‘isolate’ introduced by Prof. H. Levy (1931). In making an experiment, we try to 
construct an isolate: a system, or part of the world, which we suppose has relatively little 
interaction with the rest of the world, and which, for practical purposes, may be considered 
on its own. This isolate may contain within itself all the systems of chance causes which are 
regarded as affecting, to any practical extent, the results of the experiment. Such is the case 
in (1), where all the chance causes involved in the experiment are supposed given in 
the bag which is the subject of the experiment. Here, then, we are dealing with a ‘good 
isolate’, whose interaction with the rest of the world is really negligible, and chance causes 
operate within the isolate. 

In case (3), on the other hand, we are dealing with an imperfect isolate. The outside world, 
in the shape of the statistician, interacts with our isolate to an extent not negligible in 
practice. Fortunately, in this case we are able to construct a smaller isolate, consisting of 
the nine surviving plants, in which the interactions with the outside world are negligible. 
In case (2), there may be some doubt as to what isolate we are discussing. If we regard soil 
pests and such things as included in the isolate, and represent them as a stable set of chance 
causes, then we are entitled to analyse as in case (1); but if the pests are not included in the 
isolate, we should analyse as in case (3). 

Statistical tests are applicable to at least two types of experiment. First, to experiments 
in which the isolate studied contains within itself a system of chance causes which may 
influence the results. And second, to experiments in which the isolate studied is not a ‘good’ 
isolate, and the residual interactions with the rest of the world may affect the results. There 
may also be mixed cases. 

The distinction between the two types may also be brought out in relation to the necessity 
or otherwise of an ‘artificial’ randomization procedure, using random digits or the like. 
In the first type, such an artificial randomization procedure is not strictly necessary; for 
example, with our bag of seeds, the bag itself, and its physically indistinguishable contents, 
forms a perfectly adequate randomizer. We have in this case, as it were, an impermeable 
shield around the system, which prevents any external shocks from affecting the system. 
In the second type of experiment, we need to ensure that the interactions with the outside 
world will not mask the results we are interested in; and if we cannot ensure a practically 
complete separation from the outside world, then the effect of external interactions must 
be randomized, by a special procedure. The randomization here acts like a shock absorber, 
specially placed around the experiment to distribute external shocks evenly through the 
system. 

In the first type of experiment, the reference class to which the significance level applies 
is in fact the set of indefinite repetitions of the experiment in question. In the second type 
of experiment, the reference class is an ideal set, in which the accidental influences of the 
outside world repeat themselves exactly, while the effect of these accidents on the system 
varies as a result of the special randomization. 


REFERENCES 


BARNARD, G. A. (1946). Significance tests for 2 x 2 tables. Biometrika, 34, 123. 
Levy, H. (1931). The Universe of Science. London: Watts and Co. 













(All Rights reserved) 


BIOMETRIKA. Vol. XXXIV, Parts I and II 


CONTENTS 


The variance of the overlap of geometrical figures with reference to a bombing problem. By F. Garwood 

A study of a first dynasty series of Egyptian skulls from Sakkara and of an eleventh dynasty series from Thebes. By A. Batrawi and G. M. Morant 

The generalization of ‘Student’s’ problem when several different population variances are involved. By B. L. Welch 

The distribution of Kendall’s τ coefficient of rank correlation in rankings containing ties. By G. P. Sillitto 

The use of range in place of standard deviation in the t-test. By E. Lord 

The frequency distribution of √b_1 for samples of all sizes drawn at random from a normal population. By R. C. Geary 

On the computation of universal moments of tests of statistical normality derived from samples drawn at random from a normal universe. Application to the calculation of the seventh moment of b_2. By R. C. Geary and J. P. G. Worlledge 

The asymptotical distribution of range in samples from a normal population. By G. Elfving 

Limits of the ratio of mean range to standard deviation. By R. L. Plackett 

Significance tests for 2 x 2 tables. By G. A. Barnard 

The choice of statistical tests illustrated on the interpretation of data classed in a 2 x 2 table. By E. S. Pearson 

2 x 2 tables. A note on E. S. Pearson’s paper. By G. A. Barnard 

The cumulants of the z and of the logarithmic χ² and t distributions. By J. Wishart 

The meaning of a significance level. By G. A. Barnard 





A volume of Biometrika containing about 400 pages, with plates and tables, is normally issued annually, and it is hoped 
that the inevitable delay which has occurred under war-time conditions will soon be overcome. 
Papers for publication should either be sent to 
Professor E. S. Pearson, Department of Statistics, University College, London, W.C. 1, 
or if more convenient may be submitted through a member of the Editorial Committee, viz. 
Professor Harald Cramér, University of Stockholm, Sweden. 
Dr R. C. Geary, Statistics Branch, Department of Industry and Commerce, Dublin. 
Professor M. Greenwood, F.R.S., London School of Hygiene and Tropical Medicine, London, W.C. 1. 
Professor J. B. S. Haldane, F.R.S., University College, London, W.C. 1. 
Dr G. M. Morant, R.A.F. Institute of Aviation Medicine, R.A.F. Station, Farnborough, Hants. 
Dr John Wishart, School of Agriculture, Cambridge. 
It is a condition of publication in Biometrika that the paper shall not already have been issued elsewhere, and will not be 
reprinted without leave of the Editors. 
Contributors receive 25 copies of their papers free. Joint authors 15 copies each. 
The subscription price, payable in advance, is Inland 45s. net per volume and Abroad 54s. net (including packing and 
postage). Owing to the scarcity of early volumes, the following rates must now be charged for complete sets. Vols. I-XXXIII, 
including XX*: £134. 10s. in buckram, £123 in wrappers, not including postage. At present certain volumes are out of print. 
Recent volumes may still be obtained at the wrapper price; this is 64s. inland, including postage. Index to Vols. I to V, 
2s. net. Index to Vols. I to XV, 5s. net. Cheques must be made payable to Biometrika, crossed "a/c Biometrika T[...]" 
and sent to The Secretary, Biometrika Office, Department of Statistics, University College, London, W.C. 1, to whom all 
orders for series, single copies and offprints should be addressed. All foreign cheques must be drawn in sterling and 
on a Bank having a London Agency. 





First printed in Great Britain at the University Press, Cambridge 
Reprinted by offset-litho by Percy Lund Humphries & Co., Ltd. 





